VHDL coding tips and tricks: 2010

Saturday, December 4, 2010

Tips for running a successful simulation in Xilinx ISim.

Though I have given enough examples for learning VHDL I didn't write much about using the software till now. In this article I will cover some basics about running your simulation in Xilinx ISim. This article will point out some basic mistakes people do when simulating their code in ISim.
For explaining, I have just used one of my earlier example in the post : Explaining testbench code using a counter design. Lets go step by step, see the images for easier understanding of the steps. Open the images in a new tab in your browser if they are not clear enough.

1)Once the coding is done( I mean both the testbench and the design to be tested) make sure you select the top entity(testbench code) in the Xilinx window as shown below. Many people just select any other file and click the compilation button.
Note down the red markings in the image below. Points to be noted are:



  • Choose View  as simulation.
  • Select the top entity i.e. the testbench. If the wrong file is selected for simulation then the waveform in ISim will be blank and you will see no waveform.
  • Double click on the Behavioral check syntax for compiling the design or for finding out any syntax errors.
  • If the above step is successful then double click on Simulate Behavioral Model. If there are syntax errors in step 3 then you may have to check your code.


2)Now ISim will open in a new window with waveforms. Note down the toolbar at the bottom. Check the below image for knowing what each button does. You can also hover your mouse over the button and they will display the function of that button.

3)Mostly the signals in the wavforms will be displayed as binary numbers or integers. But you can change this basic setting. See the image below. 


Click on the signal which you want to change the display format. Go to radix and then select the format. Some options available are Binary, Hexadecimal, octal etc. Note that depending on your code , you have to change the display format. My code was a counter, so unsigned decimal was the best format in this case.

4)Another interesting thing you can do is adding the internal signals to the waveform which is not displayed by default. By default ISim displays only the signals which are declared in the testbench code. But if there are many sub entities then you may need to see them for debugging purpose. See the image below for how to do it.


  • Go to the Instance and process names on the left side of the ISim window. 
  • Select the Instance name whose internal signals you want to observe. 
  • All the signals declared in that particular instance will be displayed on the immediate right tab now, under simulation objects.
  • Now select the signals you want to display. You can use keyboard short cuts like shift  and Ctrl for selecting multiple signal names.
  • Now drag and drop these select signals into the immediate right tab under signal Name in waveform window.
  • For updating these signal values you have to restart the simulation and run it again. 


5)You may have noticed that in ISim, all the additional signals you added in step (4) are reset when you close the ISim window. This is little bit annoying since you have to add all the internal signals again. But you need not worry about it. Follow the steps:

  • Add the required signals into the waveform as described in step 4.
  • Save the waveform file by clicking, Ctrl + . Give an appropriate name to the wave file.
  •  Now close ISim and go back to the Xilinx ISE window.
  • Right click on simulate behavioral model. 
  • Choose the option Process properties.
  • A new window will open as shown below in the image.
  • Select the check box, Use custom waveform configuration file. 
  • Choose the waveform file you just saved in the  custom waveform configuration file.







Thats it for now. Hope these explain the things better. Thanks.

Monday, November 8, 2010

Synthesis Error : Wait for statement unsupported.

   This article is written as per the request of one of my readers who had some problem trying out the code given in this website. You may have seen this error in Xilinx ISE, "Wait for statement unsupported". And you may have wondered why the error is coming even after you are sure of writing a syntactically correct VHDL code.
  Check out the below code. It is just a testbench plus design code for full adder. As you can see we have 3 input bits and we apply all the 8 combinations of inputs with a 1 ns delay between them. For applying this delay we use the "wait for" statement. This is supported by VHDL.

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY tb IS
END tb;

ARCHITECTURE behavior OF tb IS
   --Inputs
   signal bit1 : std_logic := '0';
   signal bit2 : std_logic := '0';
   signal bit3 : std_logic := '0';
    --Outputs
   signal sum : std_logic;
   signal carry : std_logic;

BEGIN

    --Gates
    gate_inst : process
    begin
        sum <= bit1 xor bit2 xor bit3;
        carry <= (bit1 and bit2) or (bit3 and bit2) or (bit1 and bit3);
        wait for 1 ns;
    end process;   
       
   
   -- Stimulus process
   stim_proc: process
   begin       
        bit1<='0'; bit2<='0';  bit3<='0';  wait for 1 ns;
        bit1<='0'; bit2<='0';  bit3<='1';  wait for 1 ns;
        bit1<='0'; bit2<='1';  bit3<='0';  wait for 1 ns;
        bit1<='0'; bit2<='1';  bit3<='1';  wait for 1 ns;
        bit1<='1'; bit2<='0';  bit3<='0';  wait for 1 ns;
        bit1<='1'; bit2<='0';  bit3<='1';  wait for 1 ns;
        bit1<='1'; bit2<='1';  bit3<='0';  wait for 1 ns;
        bit1<='1'; bit2<='1';  bit3<='1';  wait for 1 ns;
      wait;
   end process;

END;

   Now when you synthesize this code you will get the above mentioned error. This is just because of  the simple fact that all VHDL codes which are syntactically right are need not synthesisable. Synthesis is the process of converting the logic described by your code in terms of actual digital circuit components.A statement like "wait for 1 ns" cannot be described in terms of real hardware and hence it is not synthesisable.
   You may ask why then such a keyword is available if it is not synthesisable.The reason is that there are plenty of situations where you just want to simulate( use the simulation software in your PC for verification, not running it in real FPGA hardware) the design. In those situations "wait for" statement is very useful, especially in case of testbench codes.
   Also, sometimes when you add a new source in Xilinx ISE ( i think versions after 12.1) its default property "View Association" is set as "Implementation".This makes the file invisible in the "simulation" mode in ISE. Once you change this "View association" to "All" the file will be visible under all the views. For this right click on the file name in Xilinx ISE window and click on "source properties". You can see the "View Association" option now. Now do "Behavioral Check syntax" under the "Simulation" view. The errors must have gone now.

I hope the things are clear now.

Sunday, October 31, 2010

Sequence detector using state machine in VHDL

   Some readers were asking for more examples related with state machine and some where asking for codes related with sequence detector.This article will be helpful for state machine designers and for people who try to implement sequence detector circuit in VHDL.
   I have created a state machine for non-overlapping detection of a pattern "1011" in a sequence of bits. The state machine diagram is given below for your reference.

The VHDL code for the same is given below. I have added comments for your easy understanding.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

--Sequence detector for detecting the sequence "1011".
--Non overlapping type.
entity seq_det is
port(   clk : in std_logic;  --clock signal
        reset : in std_logic;   --reset signal
        seq : in std_logic;    --serial bit sequence   
        det_vld : out std_logic  --A '1' indicates the pattern "1011" is detected in the sequence.
        );
end seq_det;

architecture Behavioral of seq_det is

type state_type is (A,B,C,D);  --Defines the type for states in the state machine
signal state : state_type := A;  --Declare the signal with the corresponding state type.

begin

process(clk)
begin
    if( reset = '1' ) then     --resets state and output signal when reset is asserted.
        det_vld <= '0';
        state <= A;
    elsif ( rising_edge(clk) ) then   --calculates the next state based on current state and input bit.
        case state is
            when A =>   --when the current state is A.
                det_vld <= '0';
                if ( seq = '0' ) then
                    state <= A;
                else   
                    state <= B;
                end if;
            when B =>   --when the current state is B.
                if ( seq = '0' ) then
                    state <= C;
                else   
                    state <= B;
                end if;
            when C =>   --when the current state is C.
                if ( seq = '0' ) then
                    state <= A;
                else   
                    state <= D;
                end if;
            when D =>   --when the current state is D.
                if ( seq = '0' ) then
                    state <= C;
                else   
                    state <= A;
                    det_vld <= '1';   --Output is asserted when the pattern "1011" is found in the sequence.
                end if;    
            when others =>
                NULL;
        end case;
    end if;
end process;   

end Behavioral;

If you check the code you can see that in each state we go to the next state depending on the current value of inputs.So this is a mealy type state machine.
The testbench code used for testing the design is given below.It sends a sequence of bits "1101110101" to the module. The code doesnt exploit all the possible input sequences. If you want another sequence to be checked then edit the testbench code.  If it is not working as expected, let me know.

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY blog_cg IS
END blog_cg;

ARCHITECTURE behavior OF blog_cg IS

   signal clk,reset,seq,det_vld : std_logic := '0';
   constant clk_period : time := 10 ns;

BEGIN

    -- Instantiate the Unit Under Test (UUT)
   uut: entity work.seq_det PORT MAP (
          clk => clk,
          reset => reset,
          seq => seq,
          det_vld => det_vld
        );

   -- Clock process definitions
   clk_process :process
   begin
        clk <= '0';
        wait for clk_period/2;
        clk <= '1';
        wait for clk_period/2;
   end process;

   -- Stimulus process : Apply the bits in the sequence one by one.
   stim_proc: process
   begin       
        seq <= '1';             --1
      wait for clk_period;
        seq <= '1';             --11
      wait for clk_period;
        seq <= '0';             --110
      wait for clk_period;
        seq <= '1';             --1101
      wait for clk_period;
        seq <= '1';             --11011
      wait for clk_period;
        seq <= '1';             --110111
      wait for clk_period;
        seq <= '0';             --1101110
      wait for clk_period;
        seq <= '1';             --11011101
      wait for clk_period;
        seq <= '0';             --110111010
      wait for clk_period;
        seq <= '1';             --1101110101
      wait for clk_period;
      wait;        
   end process;

END;

The simulated waveform is shown below:

Note:- The code was simulated using Xilinx 12.1 version. The results may vary slightly depending on your simulation tool.

How to Initialize a Block RAM IP in Xilinx Vivado? (With Testbench)

THIS ARTICLE WAS UPDATED on 19-04-2024. OLD ARTICLE USED XILINX ISE INSTEAD OF VIVADO.

    This article is a continuation of my earlier post, How to use Xilinx Vivado's IP Catalog to create a BRAM? (With Testbench), where I explained the steps to create a Block RAM IP in Xilinx Vivado tool.

    In there, I initialized the bram contents to zero on power up. But you can initialize it with non-zero values as well. How do we do that? It can be done with a special file, which has an extension coe.

    Let me show you how this can be done with few screenshots. Note that you should first follow the instructions from the old article to set up a BRAM as per the settings given there, before proceeding with the rest of this article.

1. In the "Other options" tab in the IP settings, check the checkbox "Load Init File". To add the coe file, click on "Edit" option. Click on "Yes" when the following dialogue box opens up.

create ceo file for bram xilinx vivado

2. Type a filename. I gave the name as "bram_contents.coe". Click on "Save".

3. The following window shows up. Enter the radix of the values. Add the list of values as shown in the screenshot. Click on "Save" and then click on "Close".

how to edit coe file in xilinx vivado

4. Now click on "Ok" and then on "Generate" to generate the IP core.

5. Now we want to test whether the bram is correctly initialized. I have written the following testbench file for this.

library ieee;
use ieee.std_logic_1164.all;

--empty entity for testbench
entity tb_bram is
end tb_bram;

architecture Behavioral of tb_bram is

--copy paste the port declaration of the bram from the xilinx generated file.
component bram_test is
  PORT (
    clka : IN STD_LOGIC;
    ena : IN STD_LOGIC;
    wea : IN STD_LOGIC_VECTOR(0 DOWNTO 0);
    addra : IN STD_LOGIC_VECTOR(7 DOWNTO 0);
    dina : IN STD_LOGIC_VECTOR(7 DOWNTO 0);
    douta : OUT STD_LOGIC_VECTOR(7 DOWNTO 0)
  );
end component;

--port signals for connecting with the bram
signal clka, ena : std_logic := '0';
signal wea : std_logic_vector(0 downto 0) := "0";
signal addra, dina, douta : std_logic_vector(7 downto 0) := (others => '0');
constant clk_period : time := 10 ns; --clock period.

begin

--componenent instantiation using named association.
bram : bram_test port map(
    clka => clka,
    ena => ena,
    wea => wea,
    addra => addra,
    dina => dina,
    douta => douta);

--generate the clock for bram
clk_generation: process
begin
    wait for clk_period/2;
    clka <= not clka;
end process;

--this is where we generate the inputs to apply to the bram
stimulus: process
begin
    wait for clk_period;
    ena <= '1';
    wea <= "0";
    addra <= x"00"; wait for clk_period;
    addra <= x"01"; wait for clk_period;
    addra <= x"02"; wait for clk_period;
    addra <= x"03"; wait for clk_period;
    addra <= x"04"; wait for clk_period;
    addra <= x"05"; wait for clk_period;
    addra <= x"06"; wait for clk_period;
    wait;
end process;

end Behavioral;

6. Once we simulate it, we will get the following simulation waveform. You can see that the first 5 memory locations(addresses 0 to 5) are correctly initialized from 1 to 5, and the rest with zeros.

simulation waveform of bram xilinx vivado


Note :- coe file has different uses. Here we used it to initialize the contents of a BRAM. For an FIR filter, we can use it to specify the filter coefficients.

I have used Xilinx Vivado 2023.2 tool for this article.

If you prefer to see this in action, watch this Youtube video I have created.



Thursday, October 7, 2010

How to use Xilinx Vivado's IP Catalog to create a BRAM? (With Testbench)

THIS ARTICLE WAS UPDATED on 18-04-2024. OLD ARTICLE USED XILINX ISE INSTEAD OF VIVADO.


    BRAM(Block Random access memory) is an advanced memory constructor that generates area and performance-optimized memories using embedded block RAM resources in Xilinx FPGAs. I hope you have already gone through the Core generator introductory tutorial before. If you haven't please read those articles here.


    We will be using Xilinx Vivado 2023.2 version for this. The steps should be the same in any version of Vivado, I believe. Let me guide you through the process with a series of screenshots.

1. Create a new project in Xilinx Vivado.
2. Click on IP Catalog, under flow navigator on the left pane of the software. You will see something like this on screen:


xilinx vivado IP catalog how to start?


2. Type "BRAM" in the search bar and you will get a list of matching results as shown below:


xilinx vivado IP catalog ; search for bram

3. Double click on Block Memory Generator, which is highlighted in the previous screenshot. A new window opens up where you can set the properties of the BRAM. You can check out the data sheet of the BRAM by clicking on Documentation. This is helpful in case you dont understand what these settings are. The component name can be changed as well.


bram IP in xilinx vivado introduction


4. Now we can start customizing the BRAM as per our requirements. Let me share the screenshots of the settings I have chosen.

bram ip in xilinx vivado settings-first page- basic.

bram ip in xilinx vivado settings-second page

bram ip setting from xilinx vivado. initialization file



We have customized our BRAM with the following settings:

  1. It will be of 256 by 8 bits in size.
  2. Algorithm will be chosen for "Low Power".
  3. Read and write width will be of 8 bit.
  4. You can initialize the memory with a coe file, but I have chosen them to be initialized with zeros.


5. Once done, you can click on the summary tab to verify that the settings are correctly chosen.


6. I have set the component name as "bram_test". Now click on the "OK" button at the bottom.

7. Click on "Generate".

generate ip in xilinx vivado

8. The tool will generate the necessary files and will add bram_test as a new source to the project. You should be able to see the new component under "Design Sources".

design sources xilinx vivado

9. Note that double clicking on the .xci file which is highlighted in the above image, will open the IP settings window once again. This means that anytime you can change the bram settings and regenerate the files if you want to.

10. To test the bram entity, we would need its port declaration. Meaning, we would need to know what are the input and output signals, their data type, size etc. To know this, we can click on the arrow, >, which shows the underlying vhdl file. This file contains the port definition which we can copy paste into our testbench code.

vhdl file generated for the ip by vivado xilinx

11. Next step would be to write a testbench code for testing the BRAM we just created. Create a new simulation source (tb_bram.vhd) and copy the following code into it:

library ieee;
use ieee.std_logic_1164.all;

--empty entity for testbench
entity tb_bram is
end tb_bram;

architecture Behavioral of tb_bram is

--copy paste the port declaration of the bram from the xilinx generated file.
component bram_test is
  PORT (
    clka : IN STD_LOGIC;
    ena : IN STD_LOGIC;
    wea : IN STD_LOGIC_VECTOR(0 DOWNTO 0);
    addra : IN STD_LOGIC_VECTOR(7 DOWNTO 0);
    dina : IN STD_LOGIC_VECTOR(7 DOWNTO 0);
    douta : OUT STD_LOGIC_VECTOR(7 DOWNTO 0)
  );
end component;

--port signals for connecting with the bram
signal clka, ena : std_logic := '0';
signal wea : std_logic_vector(0 downto 0) := "0";
signal addra, dina, douta : std_logic_vector(7 downto 0) := (others => '0');
constant clk_period : time := 10 ns; --clock period.

begin

--componenent instantiation using named association.
bram : bram_test port map(
    clka => clka,
    ena => ena,
    wea => wea,
    addra => addra,
    dina => dina,
    douta => douta);

--generate the clock for bram
clk_generation: process
begin
    wait for clk_period/2;
    clka <= not clka;
end process;

--this is where we generate the inputs to apply to the bram
stimulus: process
begin
    wait for clk_period;
    ena <= '1';
    wea <= "1";
    addra <= x"00"; dina <= x"A5";   wait for clk_period;
    addra <= x"04"; dina <= x"B6";   wait for clk_period;
    addra <= x"05"; dina <= x"C7";   wait for clk_period;
    ena <= '0';
    addra <= x"07"; dina <= x"D8";   wait for clk_period;
    wea <= "0";
    ena <= '1';
    addra <= x"00"; wait for clk_period;
    addra <= x"04"; wait for clk_period;
    addra <= x"05"; wait for clk_period;
    addra <= x"07"; wait for clk_period;
    wait;
end process;

end Behavioral;

12. Right click on "tb_bram.vhd" and click on "set as Top". 

13. Run Behavioral Simulation. You should get the following waveform:

simulation waveform of xilinx bram ip in vivado

    You can verify that, when the enable signal ena is low, BRAM doesnt respond to any read or write instruction. Also, there is a 2 clock cycle delay (or latency as they call it) for reading any address from bram. This is as expected as the summary page in the IP generate window says this: "Total port a read Latency: 2 clock cycles".

Note :- What we have tried out here is a very simple BRAM. But it demonstrates how we can work with Xilinx in built IPs. The design will get complicated when you go from single port to dual port RAM's. But the basic idea remains the same. By reading the documentation supplied by Xilinx you can explore more settings used in the GUI tool. For testing purpose I have used Xilinx Vivado 2023.2. The options in the IP tool may vary slightly depending on the version you are using.


If you prefer to see this in action, watch this Youtube video I have created.



Sunday, October 3, 2010

VHDL: 3 bit Magnitude Comparator With Testbench (Gate level Modeling)

    In this post I want to share the VHDL code for a 3 bit comparator which is designed using basic logic gates such as XNOR, OR, AND etc. The code was tested using a self checking testbench which tested the design for all its 64 combinations of inputs.

3 bit Comparator Circuit Diagram:



VHDL code for 3 bit comparator:


library ieee;
use ieee.std_logic_1164.all;

entity comparator is
port(a,b : in std_logic_vector(2 downto 0);  --3 bit numbers to be compared
    a_eq_b : out std_logic;  --a equals b
    a_lt_b : out std_logic;  --a less than b
    a_gt_b : out std_logic   --a greater than b
    );    
end comparator;

architecture gate_level of comparator is

--declare internal signals used in the design
signal temp1,temp2,temp3,temp4,temp5,temp6,temp7,temp8,temp9 : std_logic := '0';

begin

temp1 <= not(a(2) xor b(2));  --XNOR gate with 2 inputs.
temp2 <= not(a(1) xor b(1));  --XNOR gate with 2 inputs.
temp3 <= not(a(0) xor b(0));  --XNOR gate with 2 inputs.
temp4 <= (not a(2)) and b(2);
temp5 <= (not a(1)) and b(1);
temp6 <= (not a(0)) and b(0);
temp7 <= a(2) and (not b(2));
temp8 <= a(1) and (not b(1));
temp9 <= a(0) and (not b(0));

a_eq_b <= temp1 and temp2 and temp3;  -- for a equals b.
a_lt_b <= temp4 or (temp1 and temp5) or (temp1 and temp2 and temp6); --for a less than b
a_gt_b <= temp7 or (temp1 and temp8) or (temp1 and temp2 and temp9); --for a greater than b

end gate_level;

Testbench for 3 bit Comparator:


library ieee;
use ieee.std_logic_1164.all;
--Note that I am using this library only in testbench. Generally its 
--advised to not use this library.
use ieee.std_logic_arith.all;

--testbench has empty entity
entity tb_comparator is
end entity tb_comparator;

architecture behavioral of tb_comparator is

signal a_eq_b,a_lt_b,a_gt_b : std_logic := '0';
signal a,b :std_logic_vector(2 downto 0) := "000";

begin

--entity instantiation with named association port mapping
comparator_uut: entity work.comparator
    port map(a => a,
        b => b,
        a_eq_b => a_eq_b,
        a_lt_b => a_lt_b,
        a_gt_b => a_gt_b);

--self checking testbench. 
--checks for all combinations of inputs.
stimulus: process
begin 
    for i in 0 to 7 loop
        for j in 0 to 7  loop
            --typecast integers to std_logic_vector.
            a <= conv_std_logic_vector(i,3);
            b <= conv_std_logic_vector(j,3);
            wait for 1 ns;
            --the below statements does the automatic verification
            if(a = b) then
                assert a_eq_b = '1';
            elsif(a < b) then
                assert a_lt_b = '1';
            elsif(a > b) then
                assert a_gt_b = '1';
            end if;
        end loop; 
    end loop; 
    wait;  --testing done. wait endlessly
end process;

end behavioral;

Simulation waveform from Modelsim:


3 bit comparator simulation waveform in vhdl modelsim



Post-Synthesis Schematic from Xilinx Vivado:


Post-Synthesis Schematic from Xilinx Vivado for comparator

    You can see that 3 LUT6's were used to implement this logic. As a little homework, why don't you write the behavioral level code for a 3 bit comparator and see if the amount of LUT's used are more or less compared to what we have now. If you do this experiment, let me know the results in the comment section. 


Tuesday, September 14, 2010

VHDL: "Generate" Keyword with Examples

    Generate statement is a concurrent statement used in VHDL to describe repetitive structures. You can use generate keyword in your design to instantiate multiple components in just few lines. It can be even combined with conditional statements such as if .. else or iteration statements such as for loops.

    In the first part of this post, I will combine the generate keyword with a for loop to implement a PISO using D flipflops. In fact, in an earlier post, I have done it using individual D flipflop instantiations. You can refer to that code from here, PISO in Gate level and Behavioral level Modeling.

PISO using Generate & for keywords:


--library declaration.
library ieee;
use ieee.std_logic_1164.all;

--parallel in serial out shift register
entity piso is
port(Clk : in std_logic;  --Clock signal
    parallel_in : in std_logic_vector(3 downto 0);  --4 bit parallel load
    load : in std_logic;  --active high for loading the register
    serial_out : out std_logic  --output bit. 
    );
end piso;

architecture gate_level of piso is

signal D,Q : std_logic_vector(3 downto 0) := "0000";

begin

serial_out <= Q(3);

--entity instantiation of the D flipflop using "generate".
F :  --label name
    for i in 0 to 3 generate   --D FF is instantiated 4 times.
    begin  --"begin" statement for "generate"
    ----usual port mapping. Entity instantiation with named association
        FDRSE_inst : entity work.FDRSE1 port map   
            (Clk => Clk,
            ce => '1',
            reset => '0',
            D => D(i),
            set => '0',
            Q => Q(i));  
    end generate F;  --end "generate" block.
--The D inputs of the flip flops are controlled with the load input.
--Two AND gates with a OR gate is used for this.
D(0) <= parallel_in(3) and load;
D(1) <= (parallel_in(2) and load) or (Q(0) and not(load));
D(2) <= (parallel_in(1) and load) or (Q(1) and not(load));
D(3) <= (parallel_in(0) and load) or (Q(2) and not(load));

end gate_level;

The FDRSE flip code can be copied from here: Synchronous D Flip-Flop with Testbench.

I believe that the code is self explanatory. The for loop is used to instantiate as many number of modules as we want to instantiate. To easily port map in a programmatic way, I declare a std_logic_vector for D and Q, and use the index "i" to connect bits of the individual signals to the ports of the individual flipflops.

Johnson Counter using Generate Statement:


    Next I want to show you how to combine generate statement with both if else conditional statement and for loop, to programmatically instantiate multiple components. For this I will be using the example of a Johnson counter.

library ieee;
use ieee.std_logic_1164.all;

entity johnson_counter is
port(clk : in std_logic;
    reset : in std_logic;
    count : out std_logic_vector(3 downto 0)
    );
end johnson_counter;

architecture Behavioral of johnson_counter is

signal D,Q : std_logic_vector(3 downto 0):="0000";
signal not_Q4 : std_logic:='0';

begin

--Q of the last flipflop is fed to the D of the first flop
not_Q4 <= not Q(3);
count <= Q;  --Q is assigned to output port

--generate the instantiation statements for the 4 fliflops
--F,F0 and F1 are label names for the generate statement.
--'generate' statements should have a 'begin' and an 'end'
F : for i in 0 to 3 generate
    begin
        F0 : if (i = 0) generate  --instantiate the "first" FF only.
            begin U1 : entity work.FDRSE1 port map --usual port mapping   
                (Clk => Clk,
                ce => '1',
                reset => reset,
                D => not_Q4,
                set => '0',
                Q => Q(0));     
            end generate F0;
        F1 : if (i > 0) generate --generating the rest of the three FF's.
            begin U2 : entity work.FDRSE1 port map   --usual port mapping
                (Clk => Clk,
                ce => '1',
                reset => reset,
                D => Q(i-1),
                set => '0',
                Q => Q(i));     
            end generate F1;
    end generate F;  
   
end Behavioral;

You can get the testbench for the Johnson counter from this post: 4 bit Johnson Counter with Testbench.
The FDRSE flip code can be copied from here: Synchronous D Flip-Flop with Testbench.

    As you can see from these examples, using "generate" keyword makes your code much more smaller and neat. They even help others to go through and understand your code faster.

    Make sure to visit the older posts linked in the above post to get the testbenches and see screenshots of the simulation waveform.