Re: [cedar] DMA ring test timeout [solved]
On 06/05/2022, Amol wrote:
> Hello,
>
> While trying to program the HD 7350 Cedar GPU to run with DPM
> under the 157MHz/200MHz sclk/mclk powerstate, for single_display,
> and with forced LOW performance on the SMC, the DMA ring seems
> to hang.
>
. . . . . .
>
> Does this mean that the GPU doesn't support running DMA ring at the
> lowest perf profile (157Mhz/200MHz)? I do still believe that this
> situation might be a result of faulty/missing programming on my part,
> though I am not sure what exactly it is that is at fault or is missing.

The mc_reg_table was being populated with invalid entries.

Thanks,
Amol
[cedar] DMA ring test timeout
Hello,

While trying to program the HD 7350 Cedar GPU to run with DPM
under the 157MHz/200MHz sclk/mclk powerstate, for single_display,
and with forced LOW performance on the SMC, the DMA ring seems
to hang.

After the desired power state is programmed, the DMA and CP rings
0xcafedead tests are run. The CP ring test succeeds but the DMA ring
test times out. Note that the Linux radeon driver does not wait so late
during its initialization to run these tests.

The GPU's DMA ring RPTR is found to be at index 3 (it should be at
index 4 after consuming all 4 32-bit words, when starting at index 0).
Since the write-back of GPU's RPTR is successful, the DMA from GPU to
system RAM works.

Contents of some registers, before and after running the DMA test:
DMA_STATUS:  0x44c83d57, 0x44c83156 (IDLE bit is off in the after status)
GRBM_STATUS: 0x3828, 0x3828
SRBM_STATUS: 0x20c0, 0x20c0

If the DMA WRITE(2) cmd is replaced with a TRAP(7), the DMA RPTR does
not even move a single step - after the timeout, it is found to be still
at 0. And the IDLE status is found to be OFF. The expected interrupt
isn't generated.

If, instead, 4 NOPs(15) are sent, the DMA ring is again found to be stuck
at RPTR=3 with IDLE status as OFF. It seems to have an affinity towards
the 3rd position from the start.

I also ran the CP ring test with a MEM_WRITE operation instead of the
default SET_CONFIG_REG op. The test succeeds, thus proving that the CP
ring can indeed DMA into the system RAM at the lowest perf profile.

Does this mean that the GPU doesn't support running DMA ring at the
lowest perf profile (157Mhz/200MHz)? I do still believe that this
situation might be a result of faulty/missing programming on my part,
though I am not sure what exactly it is that is at fault or is missing.

The machine is a kvm-vfio-enabled VM; the current ArchLinux ISO fails to
initialize the passthru device (-22 from radeon_device_init).

Thanks,
Amol
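
[Editorial sketch] The observations in this report (a 4-dword test packet, RPTR stuck at 3 instead of 4, and the two DMA_STATUS snapshots) can be expressed with a few small helpers. This is a minimal sketch and not the poster's code. The command numbers 2, 7 and 15 come from the message itself. The header layout (command in bits 31:28, count in the low bits), the address masking, and every function name here are assumptions made for illustration, so check them against the driver headers before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Command numbers as quoted in the report: WRITE(2), TRAP(7), NOP(15). */
#define DMA_CMD_WRITE 0x2u
#define DMA_CMD_TRAP  0x7u
#define DMA_CMD_NOP   0xfu

/* ASSUMPTION: command in bits 31:28, dword count in the low 16 bits. */
static uint32_t dma_header(uint32_t cmd, uint32_t count)
{
    return ((cmd & 0xfu) << 28) | (count & 0xffffu);
}

/* Build a 4-dword "write one value to memory" test packet (hypothetical helper). */
static void build_write_test(uint32_t pkt[4], uint64_t gpu_addr, uint32_t value)
{
    pkt[0] = dma_header(DMA_CMD_WRITE, 1);
    pkt[1] = (uint32_t)gpu_addr & 0xfffffffcu;   /* ASSUMPTION: dword-aligned low address */
    pkt[2] = (uint32_t)(gpu_addr >> 32) & 0xffu; /* ASSUMPTION: 8 upper address bits */
    pkt[3] = value;
}

/* Dwords the engine has not consumed yet; ring size must be a power of two. */
static uint32_t ring_pending(uint32_t rptr, uint32_t wptr, uint32_t ring_dwords)
{
    assert(ring_dwords != 0 && (ring_dwords & (ring_dwords - 1)) == 0);
    return (wptr - rptr) & (ring_dwords - 1);
}

/* Bits that differ between two status register snapshots. */
static uint32_t status_changed_bits(uint32_t before, uint32_t after)
{
    return before ^ after;
}
```

With a PAGE_SIZE ring (1024 dwords, assuming 4 KiB pages), ring_pending(3, 4, 1024) is 1, which is the stuck state described above, and ring_pending(4, 4, 1024) is 0 for a completed test. status_changed_bits(0x44c83d57, 0x44c83156) gives 0x00000c01: bit 0 flips, which agrees with the remark that the IDLE bit is off afterwards if bit 0 is the idle flag (an assumption here), and bits 10 and 11 also change.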
Re: Minimal GPU setup
Thank you Alex.

On 07/02/2022, Deucher, Alexander wrote:
> [AMD Official Use Only]
>
> Most of the register programming in evergreen_gpu_init is required. That
> code handles things like harvesting (e.g., disabling bad hardware resources)
> and setting sane asic specific settings in some registers. If you don't do
> it, work may get scheduled to bad or incorrectly configured hardware blocks
> which will lead to hangs or corrupted results. You can probably skip some
> of them, but I don't remember what is minimally required off hand. It's
> generally a good idea to re-initialize those registers anyway in case
> someone has previously messed with them (e.g., manual register munging or
> GPU passed through to a VM etc.).

Understood.

>
> Posting the bios is enough to get you a working memory controller and enough
> asic setup to light up displays (basically what you need for pre-OS
> console). As Christian mentioned, loading the ucodes will get the
> associated engines working so that you can start feeding commands to the
> GPU, but without proper configuration of the various hardware blocks on the
> GPU, you may not have success in feeding data to the GPU.

Understood. I think I wanted a confirmation that the steps I took so far are
not completely incorrect and may be just enough to see some GPU activity,
before I spend more effort programming other blocks.

The feedback and a small but working test helps restore the motivation.

Thanks,
Amol

>
> Alex
>
>
>
> From: amd-gfx on behalf of Amol
>
> Sent: Saturday, February 5, 2022 4:47 AM
> To: amd-gfx@lists.freedesktop.org
> Subject: Minimal GPU setup
>
> Hello,
>
> I am learning to program Radeon HD 7350 by reading the radeon
> driver source in Linux, and the guides/manuals from AMD.
>
> I understand the general flow of initialization the driver performs. I
> have also been able to understand and re-implement the ATOM
> BIOS virtual machine.
>
> I am trying to program the device up from scratch (i.e. bare-metal).
> Do I need to perform all those steps that the driver does? Reading
> the evergreen_gpu_init function is demotivating; it initializes many
> fields and registers which I suspect may not be required for a minimal
> setup.
>
> Is posting the BIOS and loading the microcode enough to get me started
> with running basic tasks (DMA transfers, simple packet processing, etc.)?
>
> Thanks,
> Amol
>
Re: Minimal GPU setup
Thank you Christian.

On 06/02/2022, Christian König wrote:
> Hi Amol,
>
> On 05.02.22 at 10:47, Amol wrote:
. . .
>> Is posting the BIOS and loading the microcode enough to get me started
>> with running basic tasks (DMA transfers, simple packet processing, etc.)?
>
> Well yes and no. As bare minimum you need the following:
> 1. Firmware loading
> 2. Memory management
> 3. Ring buffer setup
> 4. Hardware initialization
>
> When that is done you can write commands into the ring buffers of the CP
> or SDMA and see if they are executed (see the *_ring_test() functions in
> the driver). SDMA is usually easier to get working.

The DMA-ring-test of making the SDMA write into a WB location in the
system RAM succeeded.

The sequence followed mimics what the Linux driver does for the most part,
until evergreen_gpu_init. That and the portions of power mgmt, interrupt
mgmt, indirect buffer mgmt, the entire _modeset_init were skipped for now.

The WB and the CP, DMA ring buffers are PAGE_SIZE buffers in the system RAM.
GTT is a 512-entries table, in the BAR0 aperture, appropriately filled in
to map the WB, CP and DMA buffers.

>
> When you got that working you can worry about IB (indirect buffers)
> which are basically subroutines calls written into the ring buffers.
>
> Most commands (like copy from A to B, fill something, write value X to
> memory or write X into register Y) can be used from the ring buffers
> directly, but IIRC some context switching commands which are part of the
> rendering process require special handling.
>
> But keep in mind that all of this will just be horrible slow because the
> ASIC runs with the bootup clocks which are something like 100Mhz or even
> only 17Mhz on very old models. To change that you need to implement
> power management, interrupt handling etc etc

Understood. Yes, the DPM and the IH portions. I think by programming only
for the hardware I have I can manage to set them up with comparatively less
effort.

Thanks,
Amol

>
> Good luck,
> Christian.
>
>>
>> Thanks,
>> Amol
>
>
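
[Editorial sketch] The reply above describes a 512-entry GTT filled in to map the WB, CP and DMA buffers. A rough illustration of such a fill is below. It is a sketch under stated assumptions, not the poster's code. The flag bit positions follow the R600_PTE_* values as recalled from the Linux radeon driver and should be verified against radeon.h. The 4 KiB page size and the names gtt_make_pte and gtt_map are hypothetical. A real table lives in the BAR0 aperture and is written through MMIO, whereas a plain array stands in for it here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GTT_ENTRIES   512u   /* table size quoted in the message */
#define GPU_PAGE_SIZE 4096u  /* ASSUMPTION: 4 KiB pages */

/* ASSUMPTION: flag bits as recalled from the Linux driver's R600_PTE_* macros. */
#define PTE_VALID     (1u << 0)
#define PTE_SYSTEM    (1u << 1)
#define PTE_SNOOPED   (1u << 2)
#define PTE_READABLE  (1u << 5)
#define PTE_WRITEABLE (1u << 6)

/* Page-align the bus address and mark it as a valid, snooped system page. */
static uint64_t gtt_make_pte(uint64_t bus_addr)
{
    return (bus_addr & ~(uint64_t)(GPU_PAGE_SIZE - 1)) |
           PTE_VALID | PTE_SYSTEM | PTE_SNOOPED |
           PTE_READABLE | PTE_WRITEABLE;
}

/* Map n_pages contiguous pages starting at table index 'first'.
 * Returns 0 on success, -1 if the range does not fit in the table. */
static int gtt_map(uint64_t *table, uint32_t first,
                   uint64_t bus_addr, uint32_t n_pages)
{
    uint32_t i;

    assert(table != NULL);
    if (first >= GTT_ENTRIES || n_pages > GTT_ENTRIES - first)
        return -1;
    for (i = 0; i < n_pages; i++)
        table[first + i] = gtt_make_pte(bus_addr + (uint64_t)i * GPU_PAGE_SIZE);
    return 0;
}
```

Three single-page calls (one each for the WB, CP ring and DMA ring buffers) would cover the setup described in the message. The range check keeps a bad index from writing past the 512 entries.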
Minimal GPU setup
Hello,

I am learning to program Radeon HD 7350 by reading the radeon
driver source in Linux, and the guides/manuals from AMD.

I understand the general flow of initialization the driver performs. I
have also been able to understand and re-implement the ATOM
BIOS virtual machine.

I am trying to program the device up from scratch (i.e. bare-metal).
Do I need to perform all those steps that the driver does? Reading
the evergreen_gpu_init function is demotivating; it initializes many
fields and registers which I suspect may not be required for a minimal
setup.

Is posting the BIOS and loading the microcode enough to get me started
with running basic tasks (DMA transfers, simple packet processing, etc.)?

Thanks,
Amol
Re: [radeon] connector_info_from_object_table
On 19/11/2021, Alex Deucher wrote:
> On Thu, Nov 18, 2021 at 11:37 AM Amol wrote:
>>
>> Hello,
>>
>> The function radeon_get_atom_connector_info_from_object_table,
>> at location [1], ends up parsing ATOM_COMMON_TABLE_HEADER
>> as ATOM_COMMON_RECORD_HEADER if
>> enc_obj->asObjects[k].usRecordOffset is zero. It is found to be zero
>> in the BIOS found at [2].
>>
>> Thankfully, the loop that follows exits immediately since ucRecordSize
>> is 0 because
>> (ATOM_COMMON_TABLE_HEADER.usStructureSize & 0xff00) is zero.
>> But, with suitable values in the usStructureSize, the loop can be made to
>> run and parse garbage.
>>
>> A similar loop exists when parsing the conn objects.
>
> Can you send a patch to make it more robust?

Sent on a separate email.

Thanks,
Amol

>
> Thanks,
>
> Alex
>
>>
>> -Amol
>>
>> [1]
>> https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/radeon/radeon_atombios.c#L652
>> [2] https://www.techpowerup.com/vgabios/211981/211981
>
[PATCH] drm/radeon: more sanity checks (usRecordOffset) to obj info record parsing
When parsing Encoder, Connector, or Router records, if the usRecordOffset
field is 0, the driver ends up dereferencing ATOM_COMMON_TABLE_HEADER of the
Object Table as ATOM_COMMON_RECORD_HEADER.

A BIOS, which triggers such dereferences when parsing the Encoder records,
is found on Cedar Radeon HD 7350/8350 GPU.

Allow record dereferences only if usRecordOffset is non-zero.

Signed-off-by: Amol Surati
---
 drivers/gpu/drm/radeon/radeon_atombios.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_atombios.c b/drivers/gpu/drm/radeon/radeon_atombios.c
index 28c4413f4..bab0e1cc2 100644
--- a/drivers/gpu/drm/radeon/radeon_atombios.c
+++ b/drivers/gpu/drm/radeon/radeon_atombios.c
@@ -646,14 +646,15 @@ bool radeon_get_atom_connector_info_from_object_table(struct drm_device *dev)
 				if (grph_obj_type == GRAPH_OBJECT_TYPE_ENCODER) {
 					for (k = 0; k < enc_obj->ucNumberOfObjects; k++) {
 						u16 encoder_obj = le16_to_cpu(enc_obj->asObjects[k].usObjectID);
+						u16 rec_offset = le16_to_cpu(enc_obj->asObjects[k].usRecordOffset);
 						if (le16_to_cpu(path->usGraphicObjIds[j]) == encoder_obj) {
 							ATOM_COMMON_RECORD_HEADER *record = (ATOM_COMMON_RECORD_HEADER *)
-								(ctx->bios + data_offset +
-								le16_to_cpu(enc_obj->asObjects[k].usRecordOffset));
+								(ctx->bios + data_offset + rec_offset);
 							ATOM_ENCODER_CAP_RECORD *cap_record;
 							u16 caps = 0;
 
-							while (record->ucRecordSize > 0 &&
+							while (rec_offset > 0 &&
+							       record->ucRecordSize > 0 &&
 							       record->ucRecordType > 0 &&
 							       record->ucRecordType <= ATOM_MAX_OBJECT_RECORD_NUMBER) {
 								switch (record->ucRecordType) {
@@ -677,10 +678,10 @@ bool radeon_get_atom_connector_info_from_object_table(struct drm_device *dev)
 				} else if (grph_obj_type == GRAPH_OBJECT_TYPE_ROUTER) {
 					for (k = 0; k < router_obj->ucNumberOfObjects; k++) {
 						u16 router_obj_id = le16_to_cpu(router_obj->asObjects[k].usObjectID);
+						u16 rec_offset = le16_to_cpu(router_obj->asObjects[k].usRecordOffset);
 						if (le16_to_cpu(path->usGraphicObjIds[j]) == router_obj_id) {
 							ATOM_COMMON_RECORD_HEADER *record = (ATOM_COMMON_RECORD_HEADER *)
-								(ctx->bios + data_offset +
-								le16_to_cpu(router_obj->asObjects[k].usRecordOffset));
+								(ctx->bios + data_offset + rec_offset);
 							ATOM_I2C_RECORD *i2c_record;
 							ATOM_I2C_ID_CONFIG_ACCESS *i2c_config;
 							ATOM_ROUTER_DDC_PATH_SELECT_RECORD *ddc_path;
@@ -702,7 +703,8 @@ bool radeon_get_atom_connector_info_from_object_table(struct drm_device *dev)
 									break;
 								}
 
-							while (record->ucRecordSize > 0 &&
+							while (rec_offset > 0 &&
+							       record->ucRecordSize > 0 &&
 							       record->ucRecordType > 0 &&
 							       record->ucRecordType <= ATOM_MAX_OBJECT_RECORD_NUMBER) {
 								switch (record->ucRecordType) {
@@ -753,19 +755,18 @@ bool radeon_get_atom_connector_info_from_object_table(struct dr
[radeon] connector_info_from_object_table
Hello,

The function radeon_get_atom_connector_info_from_object_table,
at location [1], ends up parsing ATOM_COMMON_TABLE_HEADER
as ATOM_COMMON_RECORD_HEADER if
enc_obj->asObjects[k].usRecordOffset is zero. It is found to be zero
in the BIOS found at [2].

Thankfully, the loop that follows exits immediately since ucRecordSize
is 0 because
(ATOM_COMMON_TABLE_HEADER.usStructureSize & 0xff00) is zero.
But, with suitable values in the usStructureSize, the loop can be made to
run and parse garbage.

A similar loop exists when parsing the conn objects.

-Amol

[1] https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/radeon/radeon_atombios.c#L652
[2] https://www.techpowerup.com/vgabios/211981/211981
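
[Editorial sketch] The aliasing described in this report can be reproduced on a plain byte buffer. In the little-endian BIOS image, the two bytes of usStructureSize land where a record header's type and size bytes would be, so a record offset of 0 makes the table header look like a record. The sketch below is a standalone illustration rather than the driver's code. The function name count_records, the MAX_RECORD_TYPE value of 20 (standing in for ATOM_MAX_OBJECT_RECORD_NUMBER), and the added length check are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: placeholder for ATOM_MAX_OBJECT_RECORD_NUMBER. */
#define MAX_RECORD_TYPE 20u

/* Walk {type, size} record headers starting at data_offset + rec_offset.
 * A rec_offset of 0 would alias the table header, so it is rejected.
 * Returns the number of records walked. */
static size_t count_records(const uint8_t *bios, size_t bios_len,
                            size_t data_offset, uint16_t rec_offset)
{
    size_t pos, count = 0;

    assert(bios != NULL);
    if (rec_offset == 0)
        return 0;   /* the guard: never parse the table header as a record */

    pos = data_offset + rec_offset;
    while (pos + 2 <= bios_len) {
        uint8_t type = bios[pos];      /* ucRecordType */
        uint8_t size = bios[pos + 1];  /* ucRecordSize */

        if (size == 0 || type == 0 || type > MAX_RECORD_TYPE)
            break;
        count++;
        pos += size;
    }
    return count;
}
```

If the table header holds usStructureSize = 0x0403, its bytes read as type 3 and size 4 when treated as a record header, which is the "suitable values" case from the report. With the guard, count_records returns 0 for rec_offset 0 regardless of what the header contains. The bios_len bound is an extra safety check that the discussion above does not mention.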