Re: [PATCH] drm/amdgpu: Fix erroneous unref of BO
On April 14, 2017 05:34 AM, Alex Xie wrote:
> According to the comment of amdgpu_bo_reserve, amdgpu_bo_reserve can
> return with -ERESTARTSYS. When this function is interrupted by a signal,
> the BO should not be unreferenced. Otherwise the BO might be released
> while it is kmapped and pinned, or the BO might be dereferenced multiple
> times, etc.
>
> 	r = amdgpu_bo_reserve(adev->vram_scratch.robj, false);

We have specified interruptible as false, so -ERESTARTSYS isn't possible here.

Thanks,
David Zhou

> Change-Id: If76071a768950a0d3ad9d5da7fcae04881807621
> Signed-off-by: Alex Xie
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 53996e3..1dcc2d1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -355,8 +355,8 @@ static void amdgpu_vram_scratch_fini(struct amdgpu_device *adev)
> 		amdgpu_bo_kunmap(adev->vram_scratch.robj);
> 		amdgpu_bo_unpin(adev->vram_scratch.robj);
> 		amdgpu_bo_unreserve(adev->vram_scratch.robj);
> +		amdgpu_bo_unref(&adev->vram_scratch.robj);
> 	}
> -	amdgpu_bo_unref(&adev->vram_scratch.robj);
> }
>
> /**
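Put together with the reserve call David quotes, the corrected teardown reads roughly as follows — a sketch of the patched amdgpu_vram_scratch_fini(), reconstructed from the hunk and the quoted context rather than copied from kernel source:

	static void amdgpu_vram_scratch_fini(struct amdgpu_device *adev)
	{
		int r;

		if (adev->vram_scratch.robj == NULL)
			return;

		r = amdgpu_bo_reserve(adev->vram_scratch.robj, false);
		if (likely(r == 0)) {
			amdgpu_bo_kunmap(adev->vram_scratch.robj);
			amdgpu_bo_unpin(adev->vram_scratch.robj);
			amdgpu_bo_unreserve(adev->vram_scratch.robj);
			/* unref only on the success path, so an interrupted
			 * reserve can no longer drop a pinned, kmapped BO */
			amdgpu_bo_unref(&adev->vram_scratch.robj);
		}
	}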
Re: [PATCH] drm/amdgpu: fix incorrect wptr address printing
On 14/04/17 11:50 AM, Huang Rui wrote:
> Signed-off-by: Huang Rui
> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> index da4559b..4736196 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> @@ -326,8 +326,8 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring)
> 				"mmSDMA%i_GFX_RB_WPTR == 0x%08x "
> 				"mmSDMA%i_GFX_RB_WPTR_HI == 0x%08x \n",
> 				me,
> -				me,
> 				lower_32_bits(ring->wptr << 2),
> +				me,
> 				upper_32_bits(ring->wptr << 2));
> 		WREG32(sdma_v4_0_get_reg_offset(me, mmSDMA0_GFX_RB_WPTR),
> 		       lower_32_bits(ring->wptr << 2));
> 		WREG32(sdma_v4_0_get_reg_offset(me, mmSDMA0_GFX_RB_WPTR_HI),
> 		       upper_32_bits(ring->wptr << 2));

Reviewed-by: Michel Dänzer

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
[PATCH] drm/amdgpu: fix incorrect wptr address printing
Signed-off-by: Huang Rui
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index da4559b..4736196 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -326,8 +326,8 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring)
 				"mmSDMA%i_GFX_RB_WPTR == 0x%08x "
 				"mmSDMA%i_GFX_RB_WPTR_HI == 0x%08x \n",
 				me,
-				me,
 				lower_32_bits(ring->wptr << 2),
+				me,
 				upper_32_bits(ring->wptr << 2));
 		WREG32(sdma_v4_0_get_reg_offset(me, mmSDMA0_GFX_RB_WPTR),
 		       lower_32_bits(ring->wptr << 2));
 		WREG32(sdma_v4_0_get_reg_offset(me, mmSDMA0_GFX_RB_WPTR_HI),
 		       upper_32_bits(ring->wptr << 2));
--
2.7.4
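The defect is purely positional — printf-style arguments must line up with the conversions in the format string. After the swap, each (%i, 0x%08x) pair receives its matching (engine id, value); a minimal reading of the fixed call, assuming a DRM_DEBUG-style printf around it:

	/* fixed pairing: (me, wptr low bits) then (me, wptr high bits).
	 * Before the swap the args read (me, me, lower, upper), so the
	 * first 0x%08x printed the engine id and the second %i printed
	 * the low wptr bits. */
	DRM_DEBUG("mmSDMA%i_GFX_RB_WPTR == 0x%08x "
		  "mmSDMA%i_GFX_RB_WPTR_HI == 0x%08x\n",
		  me, lower_32_bits(ring->wptr << 2),
		  me, upper_32_bits(ring->wptr << 2));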
Re: AMDGPU without display output
PS. If you want to use X11/OGL on this card with no display output, just add the option "virtual_display=all" to the amdgpu kernel module. It will fake a display output so that the X server can start successfully with the amdgpu DDX. You can then use a remote desktop app like VNC to view the X screen.

Regards,
Qiang

From: amd-gfx on behalf of Dennis Schridde
Sent: Thursday, April 13, 2017 11:20:22 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: Re: AMDGPU without display output

Thanks, Alex!
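For reference, a module option like this can be made persistent through modprobe configuration — a minimal sketch, where the option value is taken from the mail above and the file path is just the usual convention:

	# /etc/modprobe.d/amdgpu.conf  (assumed path; any .conf there works)
	options amdgpu virtual_display=all

The equivalent on the kernel command line would be amdgpu.virtual_display=all.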
[PATCH split 3/3] Add support for high priority scheduling in amdgpu v9
Third part of the split of the series:
	Add support for high priority scheduling in amdgpu v8

This is the part of the series that is in somewhat murkier water than the rest. I'm sending out this patch series mostly for completeness, and maybe for discussion purposes as well.

There are still 2 issues open with this series:

1) Is the spinlock patch still okay? Should we pursue this differently?

I'd rather not use a mutex here. That would mean that to program SRBM registers from an interrupt we'd need to dispatch a worker thread. That could mean extra time that the CU reservation is in place, which can impact performance. So my preferred (biased) alternative is to still move to a spinlock.

Another alternative I'm not sure of: can we take advantage of the KIQ FIFO semantics to perform SRBM writes atomically? Something like:

	ib_append(ib, PKT_WRITE_REG(SRBM_SELECT(...)))
	ib_append(ib, PKT_WRITE_REG(SOME_REG, VAL))
	ib_append(ib, PKT_WRITE_REG(SRBM_SELECT(0, 0, 0)))
	ib_submit(kiq_ring, ib)

Something that makes this immediately feel wrong, though, is the possibility of a race condition between an SRBM operation on the KIQ and one through MMIO.

2) Alex suggested changing some MMIO writes to happen on the KIQ instead. I still haven't addressed that.

I'm not sure of the full criteria for patches landing on -wip. But if these are good enough to fix with some follow-up work, I wouldn't be opposed to that idea.
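For contrast, the spinlock approach (patch 3/4 below) serializes the whole select/program/deselect window; a minimal sketch built from the lock_srbm()/unlock_srbm() helpers in that patch:

	/* SRBM registers are banked: the lock must cover selecting the
	 * bank, programming it, and restoring the default selection */
	spin_lock(&adev->srbm_lock);
	WREG32(mmSRBM_GFX_CNTL,
	       PIPEID(pipe) | MEID(mec) | VMID(vmid) | QUEUEID(queue));
	/* ... program the banked CP_HQD_* registers for this queue ... */
	WREG32(mmSRBM_GFX_CNTL, 0);
	spin_unlock(&adev->srbm_lock);

A spinlock, unlike a mutex, can be taken where sleeping is not allowed (e.g. from an interrupt), which is the motivation stated above for not dispatching a worker thread.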
[PATCH 4/4] drm/amdgpu: implement ring set_priority for gfx_v8 compute v5
Programming CP_HQD_QUEUE_PRIORITY enables a queue to take priority over other queues on the same pipe. Multiple queues on a pipe are timesliced so this gives us full precedence over other queues. Programming CP_HQD_PIPE_PRIORITY changes the SPI_ARB_PRIORITY of the wave as follows: 0x2: CS_H 0x1: CS_M 0x0: CS_L The SPI block will then dispatch work according to the policy set by SPI_ARB_PRIORITY. In the current policy CS_H is higher priority than gfx. In order to prevent getting stuck in loops of CUs bouncing between GFX and high priority compute and introducing further latency, we reserve CUs 2+ for high priority compute on-demand. v2: fix srbm_select to ring->queue and use ring->funcs->type v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_* v4: switch int to enum amd_sched_priority v5: corresponding changes for srbm_lock Acked-by: Christian König Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 3 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 95 ++ 3 files changed, 99 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 68350ca..4e81a8e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1057,40 +1057,43 @@ struct amdgpu_gfx { const struct firmware *pfp_fw; /* PFP firmware */ uint32_tpfp_fw_version; const struct firmware *ce_fw; /* CE firmware */ uint32_tce_fw_version; const struct firmware *rlc_fw; /* RLC firmware */ uint32_trlc_fw_version; const struct firmware *mec_fw; /* MEC firmware */ uint32_tmec_fw_version; const struct firmware *mec2_fw; /* MEC2 firmware */ uint32_tmec2_fw_version; uint32_tme_feature_version; uint32_tce_feature_version; uint32_tpfp_feature_version; uint32_trlc_feature_version; uint32_tmec_feature_version; uint32_tmec2_feature_version; struct amdgpu_ring gfx_ring[AMDGPU_MAX_GFX_RINGS]; unsignednum_gfx_rings; struct amdgpu_ring compute_ring[AMDGPU_MAX_COMPUTE_RINGS]; unsignednum_compute_rings; + spinlock_t cu_reserve_lock; + uint32_tcu_reserve_pipe_mask; + uint32_t cu_reserve_queue_mask[AMDGPU_MAX_COMPUTE_RINGS]; struct amdgpu_irq_src eop_irq; struct amdgpu_irq_src priv_reg_irq; struct amdgpu_irq_src priv_inst_irq; /* gfx status */ uint32_tgfx_current_status; /* ce ram size*/ unsignedce_ram_size; struct amdgpu_cu_info cu_info; const struct amdgpu_gfx_funcs *funcs; /* reset mask */ uint32_tgrbm_soft_reset; uint32_tsrbm_soft_reset; boolin_reset; /* s3/s4 mask */ boolin_suspend; /* NGG */ struct amdgpu_ngg ngg; }; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 674256a..971303d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -1858,40 +1858,41 @@ int amdgpu_device_init(struct amdgpu_device *adev, mutex_init(&adev->firmware.mutex); mutex_init(&adev->pm.mutex); mutex_init(&adev->gfx.gpu_clock_mutex); spin_lock_init(&adev->srbm_lock); mutex_init(&adev->grbm_idx_mutex); mutex_init(&adev->mn_lock); hash_init(adev->mn_hash); amdgpu_check_arguments(adev); /* Registers mapping */ /* TODO: block userspace mapping of io register */ spin_lock_init(&adev->mmio_idx_lock); spin_lock_init(&adev->smc_idx_lock); spin_lock_init(&adev->pcie_idx_lock); spin_lock_init(&adev->uvd_ctx_idx_lock); spin_lock_init(&adev->didt_idx_lock); spin_lock_init(&adev->gc_cac_idx_lock); spin_lock_init(&adev->audio_endpt_idx_lock); spin_lock_init(&adev->mm_stats.lock); + spin_lock_init(&adev->gfx.cu_reserve_lock); 
INIT_LIST_HEAD(&adev->shadow_list); mutex_init(&adev->shadow_list_lock); INIT_LIST_HEAD(&adev->gtt_list); spin_lock_init(&adev->gtt_list_lock); INIT_LIST_HEAD(&adev->ring_lru_list);
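The gfx_v8_0.c hunk of this patch is cut off above; the kind of sequence the commit message describes would look roughly like this — a sketch only, where vi_srbm_select() and the CP_HQD_* registers appear elsewhere in this series and the rest is assumed:

	/* select the queue's register bank, raise its HQD priorities,
	 * then restore the default selection */
	spin_lock(&adev->srbm_lock);
	vi_srbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
	WREG32(mmCP_HQD_PIPE_PRIORITY, 0x2);	/* CS_H, per the table above */
	WREG32(mmCP_HQD_QUEUE_PRIORITY, queue_priority);
	vi_srbm_select(adev, 0, 0, 0, 0);
	spin_unlock(&adev->srbm_lock);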
[PATCH 1/4] drm/amdgpu: add parameter to allocate high priority contexts v7
Add a new context creation parameter to express a global context priority. Contexts allocated with AMDGPU_CTX_PRIORITY_HIGH will receive higher priority to schedule their work than AMDGPU_CTX_PRIORITY_NORMAL (default) contexts. v2: Instead of using flags, repurpose __pad v3: Swap enum values of _NORMAL _HIGH for backwards compatibility v4: Validate usermode priority and store it v5: Move priority validation into amdgpu_ctx_ioctl(), headline reword v6: add UAPI note regarding priorities requiring CAP_SYS_ADMIN v7: remove ctx->priority Reviewed-by: Emil Velikov Reviewed-by: Christian König Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 36 --- drivers/gpu/drm/amd/scheduler/gpu_scheduler.h | 1 + include/uapi/drm/amdgpu_drm.h | 8 +- 3 files changed, 40 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c index 1969f27..df6fc9d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c @@ -8,67 +8,75 @@ * and/or sell copies of the Software, and to permit persons to whom the * Software is furnished to do so, subject to the following conditions: * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * OTHER DEALINGS IN THE SOFTWARE. 
* * Authors: monk liu */ #include #include "amdgpu.h" -static int amdgpu_ctx_init(struct amdgpu_device *adev, struct amdgpu_ctx *ctx) +static int amdgpu_ctx_init(struct amdgpu_device *adev, + enum amd_sched_priority priority, + struct amdgpu_ctx *ctx) { unsigned i, j; int r; + if (priority < 0 || priority >= AMD_SCHED_PRIORITY_MAX) + return -EINVAL; + + if (priority == AMD_SCHED_PRIORITY_HIGH && !capable(CAP_SYS_ADMIN)) + return -EACCES; + memset(ctx, 0, sizeof(*ctx)); ctx->adev = adev; kref_init(&ctx->refcount); spin_lock_init(&ctx->ring_lock); ctx->fences = kcalloc(amdgpu_sched_jobs * AMDGPU_MAX_RINGS, sizeof(struct dma_fence*), GFP_KERNEL); if (!ctx->fences) return -ENOMEM; for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { ctx->rings[i].sequence = 1; ctx->rings[i].fences = &ctx->fences[amdgpu_sched_jobs * i]; } ctx->reset_counter = atomic_read(&adev->gpu_reset_counter); /* create context entity for each ring */ for (i = 0; i < adev->num_rings; i++) { struct amdgpu_ring *ring = adev->rings[i]; struct amd_sched_rq *rq; - rq = &ring->sched.sched_rq[AMD_SCHED_PRIORITY_NORMAL]; + rq = &ring->sched.sched_rq[priority]; r = amd_sched_entity_init(&ring->sched, &ctx->rings[i].entity, rq, amdgpu_sched_jobs); if (r) goto failed; } r = amdgpu_queue_mgr_init(adev, &ctx->queue_mgr); if (r) goto failed; return 0; failed: for (j = 0; j < i; j++) amd_sched_entity_fini(&adev->rings[j]->sched, &ctx->rings[j].entity); kfree(ctx->fences); ctx->fences = NULL; return r; } @@ -79,59 +87,61 @@ static void amdgpu_ctx_fini(struct amdgpu_ctx *ctx) unsigned i, j; if (!adev) return; for (i = 0; i < AMDGPU_MAX_RINGS; ++i) for (j = 0; j < amdgpu_sched_jobs; ++j) dma_fence_put(ctx->rings[i].fences[j]); kfree(ctx->fences); ctx->fences = NULL; for (i = 0; i < adev->num_rings; i++) amd_sched_entity_fini(&adev->rings[i]->sched, &ctx->rings[i].entity); amdgpu_queue_mgr_fini(adev, &ctx->queue_mgr); } static int amdgpu_ctx_alloc(struct amdgpu_device *adev, struct amdgpu_fpriv *fpriv, + enum amd_sched_priority priority, uint32_t *id) { struct amdgpu_ctx_mgr *mgr = &fpriv->ctx_mgr; struct amdgpu_ctx *ctx; int r; ctx = kmalloc(sizeof(*ctx), GFP_KERNEL); if (!ctx) return -ENOMEM; mut
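(The patch text is cut off above.) On the userspace side, allocating a high-priority context then comes down to a single ioctl — a sketch assuming the repurposed __pad field is exposed as "priority", as this patch does, and that the caller holds CAP_SYS_ADMIN per the v6 note:

	union drm_amdgpu_ctx args = {0};

	args.in.op = AMDGPU_CTX_OP_ALLOC_CTX;
	args.in.priority = AMDGPU_CTX_PRIORITY_HIGH;	/* -EACCES without CAP_SYS_ADMIN */

	if (drmCommandWriteRead(fd, DRM_AMDGPU_CTX, &args, sizeof(args)) == 0) {
		/* args.out.alloc.ctx_id now names the high-priority context */
	}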
[PATCH 3/4] drm/amdgpu: convert srbm lock to a spinlock v2
Replace adev->srbm_mutex with a spinlock adev->srbm_lock v2: rebased on 4.12 and included gfx9 Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 4 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 4 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 2 +- drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 4 +-- drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 20 ++--- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 34 +++ drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 24 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c| 4 +-- drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c| 4 +-- 10 files changed, 51 insertions(+), 51 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index a9b7a61..68350ca 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1465,41 +1465,41 @@ struct amdgpu_device { enum amd_asic_type asic_type; uint32_tfamily; uint32_trev_id; uint32_texternal_rev_id; unsigned long flags; int usec_timeout; const struct amdgpu_asic_funcs *asic_funcs; boolshutdown; boolneed_dma32; boolaccel_working; struct work_struct reset_work; struct notifier_block acpi_nb; struct amdgpu_i2c_chan *i2c_bus[AMDGPU_MAX_I2C_BUS]; struct amdgpu_debugfs debugfs[AMDGPU_DEBUGFS_MAX_COMPONENTS]; unsigneddebugfs_count; #if defined(CONFIG_DEBUG_FS) struct dentry *debugfs_regs[AMDGPU_DEBUGFS_MAX_COMPONENTS]; #endif struct amdgpu_atif atif; struct amdgpu_atcs atcs; - struct mutexsrbm_mutex; + spinlock_t srbm_lock; /* GRBM index mutex. Protects concurrent access to GRBM index */ struct mutexgrbm_idx_mutex; struct dev_pm_domainvga_pm_domain; boolhave_disp_power_ref; /* BIOS */ boolis_atom_fw; uint8_t *bios; uint32_tbios_size; struct amdgpu_bo*stollen_vga_memory; uint32_tbios_scratch_reg_offset; uint32_tbios_scratch[AMDGPU_BIOS_NUM_SCRATCH]; /* Register/doorbell mmio */ resource_size_t rmmio_base; resource_size_t rmmio_size; void __iomem*rmmio; /* protects concurrent MM_INDEX/DATA based register access */ spinlock_t mmio_idx_lock; /* protects concurrent SMC based register access */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c index 5254562..a009990 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c @@ -152,50 +152,50 @@ static const struct kfd2kgd_calls kfd2kgd = { .write_vmid_invalidate_request = write_vmid_invalidate_request, .get_fw_version = get_fw_version }; struct kfd2kgd_calls *amdgpu_amdkfd_gfx_7_get_functions(void) { return (struct kfd2kgd_calls *)&kfd2kgd; } static inline struct amdgpu_device *get_amdgpu_device(struct kgd_dev *kgd) { return (struct amdgpu_device *)kgd; } static void lock_srbm(struct kgd_dev *kgd, uint32_t mec, uint32_t pipe, uint32_t queue, uint32_t vmid) { struct amdgpu_device *adev = get_amdgpu_device(kgd); uint32_t value = PIPEID(pipe) | MEID(mec) | VMID(vmid) | QUEUEID(queue); - mutex_lock(&adev->srbm_mutex); + spin_lock(&adev->srbm_lock); WREG32(mmSRBM_GFX_CNTL, value); } static void unlock_srbm(struct kgd_dev *kgd) { struct amdgpu_device *adev = get_amdgpu_device(kgd); WREG32(mmSRBM_GFX_CNTL, 0); - mutex_unlock(&adev->srbm_mutex); + spin_unlock(&adev->srbm_lock); } static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id, uint32_t queue_id) { struct amdgpu_device *adev = get_amdgpu_device(kgd); uint32_t mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1; uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec); 
lock_srbm(kgd, mec, pipe, queue_id, 0); } static void release_queue(struct kgd_dev *kgd) { unlock_srbm(kgd); } static void kgd_progr
[PATCH 2/4] drm/amdgpu: add framework for HW specific priority settings v6
Add an initial framework for changing the HW priorities of rings. The framework allows requesting priority changes for the lifetime of an amdgpu_job. After the job completes the priority will decay to the next lowest priority for which a request is still valid. A new ring function set_priority() can now be populated to take care of the HW specific programming sequence for priority changes. v2: set priority before emitting IB, and take a ref on amdgpu_job v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_* v4: plug amdgpu_ring_restore_priority_cb into amdgpu_job_free_cb v5: use atomic for tracking job priorities instead of last_job v6: rename amdgpu_ring_priority_[get/put]() and align parameters Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 7 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 78 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 15 ++ drivers/gpu/drm/amd/scheduler/gpu_scheduler.h | 7 +++ 4 files changed, 106 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 86a1242..ac90dfc 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -78,40 +78,41 @@ int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, unsigned size, return r; } void amdgpu_job_free_resources(struct amdgpu_job *job) { struct dma_fence *f; unsigned i; /* use sched fence if available */ f = job->base.s_fence ? &job->base.s_fence->finished : job->fence; for (i = 0; i < job->num_ibs; ++i) amdgpu_ib_free(job->adev, &job->ibs[i], f); } static void amdgpu_job_free_cb(struct amd_sched_job *s_job) { struct amdgpu_job *job = container_of(s_job, struct amdgpu_job, base); + amdgpu_ring_priority_put(job->ring, amd_sched_get_job_priority(s_job)); dma_fence_put(job->fence); amdgpu_sync_free(&job->sync); kfree(job); } void amdgpu_job_free(struct amdgpu_job *job) { amdgpu_job_free_resources(job); dma_fence_put(job->fence); amdgpu_sync_free(&job->sync); kfree(job); } int amdgpu_job_submit(struct amdgpu_job *job, struct amdgpu_ring *ring, struct amd_sched_entity *entity, void *owner, struct dma_fence **f) { int r; job->ring = ring; @@ -152,38 +153,44 @@ static struct dma_fence *amdgpu_job_dependency(struct amd_sched_job *sched_job) fence = amdgpu_sync_get_fence(&job->sync); } return fence; } static struct dma_fence *amdgpu_job_run(struct amd_sched_job *sched_job) { struct dma_fence *fence = NULL; struct amdgpu_job *job; int r; if (!sched_job) { DRM_ERROR("job is null\n"); return NULL; } job = to_amdgpu_job(sched_job); BUG_ON(amdgpu_sync_peek_fence(&job->sync, NULL)); + r = amdgpu_ring_priority_get(job->ring, +amd_sched_get_job_priority(&job->base)); + if (r) + DRM_ERROR("Failed to set job priority (%d)\n", r); + trace_amdgpu_sched_run_job(job); r = amdgpu_ib_schedule(job->ring, job->num_ibs, job->ibs, job, &fence); if (r) DRM_ERROR("Error scheduling IBs (%d)\n", r); /* if gpu reset, hw fence will be replaced here */ dma_fence_put(job->fence); job->fence = dma_fence_get(fence); + amdgpu_job_free_resources(job); return fence; } const struct amd_sched_backend_ops amdgpu_sched_ops = { .dependency = amdgpu_job_dependency, .run_job = amdgpu_job_run, .timedout_job = amdgpu_job_timedout, .free_job = amdgpu_job_free_cb }; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c index 7486277..09fa8f7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c @@ -183,55 +183,126 @@ void 
amdgpu_ring_commit(struct amdgpu_ring *ring) amdgpu_ring_lru_touch(ring->adev, ring); } /** * amdgpu_ring_undo - reset the wptr * * @ring: amdgpu_ring structure holding ring information * * Reset the driver's copy of the wptr (all asics). */ void amdgpu_ring_undo(struct amdgpu_ring *ring) { ring->wptr = ring->wptr_old; if (ring->funcs->end_use) ring->funcs->end_use(ring); } /** + * amdgpu_ring_priority_put - restore a ring's priority + * + * @ring: amdgpu_ring structure holding the information + * @priority: target priority + * + * Release a request for executing at @priority + */ +void amdgpu_ring_priority_put(struct amdgpu_ring *ring, + enum amd_sched_priority priority) +{ + int i; + + if (!ring->funcs->set_priority) + return;
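The rest of amdgpu_ring_priority_put() is cut off above; the decay rule it implements ("decay to the next lowest priority for which a request is still valid") reduces to reference counting per level. A rough sketch — num_jobs, priority_mutex and the _MIN floor are assumed names, not necessarily those in the patch:

	void amdgpu_ring_priority_put(struct amdgpu_ring *ring,
				      enum amd_sched_priority priority)
	{
		int i;

		if (!ring->funcs->set_priority)
			return;

		/* keep the level while other jobs still request it */
		if (atomic_dec_return(&ring->num_jobs[priority]) > 0)
			return;

		mutex_lock(&ring->priority_mutex);
		/* decay to the highest level that still has live requests */
		for (i = priority; i >= AMD_SCHED_PRIORITY_MIN; i--) {
			if (i == AMD_SCHED_PRIORITY_MIN ||
			    atomic_read(&ring->num_jobs[i])) {
				ring->funcs->set_priority(ring, i);
				break;
			}
		}
		mutex_unlock(&ring->priority_mutex);
	}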
[PATCH 5/6] drm/amdgpu: guarantee bijective mapping of ring ids for LRU v3
Depending on usage patterns, the current LRU policy may create a non-injective mapping between userspace ring ids and kernel rings. This behaviour is undesired as apps that attempt to fill all HW blocks would be unable to reach some of them. This change forces the LRU policy to create bijective mappings only. v2: compress ring_blacklist v3: simplify amdgpu_ring_is_blacklisted() logic Signed-off-by: Andres Rodriguez Reviewed-by: Nicolai Hähnle --- drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c | 16 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 33 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 4 ++-- 3 files changed, 42 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c index 054d750..5a7c691 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c @@ -98,44 +98,56 @@ static enum amdgpu_ring_type amdgpu_hw_ip_to_ring_type(int hw_ip) return AMDGPU_RING_TYPE_GFX; case AMDGPU_HW_IP_COMPUTE: return AMDGPU_RING_TYPE_COMPUTE; case AMDGPU_HW_IP_DMA: return AMDGPU_RING_TYPE_SDMA; case AMDGPU_HW_IP_UVD: return AMDGPU_RING_TYPE_UVD; case AMDGPU_HW_IP_VCE: return AMDGPU_RING_TYPE_VCE; default: DRM_ERROR("Invalid HW IP specified %d\n", hw_ip); return -1; } } static int amdgpu_lru_map(struct amdgpu_device *adev, struct amdgpu_queue_mapper *mapper, int user_ring, struct amdgpu_ring **out_ring) { - int r; + int r, i, j; int ring_type = amdgpu_hw_ip_to_ring_type(mapper->hw_ip); + int ring_blacklist[AMDGPU_MAX_RINGS]; + struct amdgpu_ring *ring; - r = amdgpu_ring_lru_get(adev, ring_type, out_ring); + /* 0 is a valid ring index, so initialize to -1 */ + memset(ring_blacklist, 0xff, sizeof(ring_blacklist)); + + for (i = 0, j = 0; i < AMDGPU_MAX_RINGS; i++) { + ring = mapper->queue_map[i]; + if (ring) + ring_blacklist[j++] = ring->idx; + } + + r = amdgpu_ring_lru_get(adev, ring_type, ring_blacklist, + j, out_ring); if (r) return r; return amdgpu_update_cached_map(mapper, user_ring, *out_ring); } /** * amdgpu_queue_mgr_init - init an amdgpu_queue_mgr struct * * @adev: amdgpu_device pointer * @mgr: amdgpu_queue_mgr structure holding queue information * * Initialize the the selected @mgr (all asics). * * Returns 0 on success, error on failure. 
*/ int amdgpu_queue_mgr_init(struct amdgpu_device *adev, struct amdgpu_queue_mgr *mgr) { int i, r; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c index 2b452b0..7486277 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c @@ -333,66 +333,85 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring) amdgpu_wb_free(ring->adev, ring->wptr_offs); } amdgpu_bo_free_kernel(&ring->ring_obj, &ring->gpu_addr, (void **)&ring->ring); amdgpu_debugfs_ring_fini(ring); ring->adev->rings[ring->idx] = NULL; } static void amdgpu_ring_lru_touch_locked(struct amdgpu_device *adev, struct amdgpu_ring *ring) { /* list_move_tail handles the case where ring isn't part of the list */ list_move_tail(&ring->lru_list, &adev->ring_lru_list); } +static bool amdgpu_ring_is_blacklisted(struct amdgpu_ring *ring, + int *blacklist, int num_blacklist) +{ + int i; + + for (i = 0; i < num_blacklist; i++) { + if (ring->idx == blacklist[i]) + return true; + } + + return false; +} + /** * amdgpu_ring_lru_get - get the least recently used ring for a HW IP block * * @adev: amdgpu_device pointer * @type: amdgpu_ring_type enum + * @blacklist: blacklisted ring ids array + * @num_blacklist: number of entries in @blacklist * @ring: output ring * * Retrieve the amdgpu_ring structure for the least recently used ring of * a specific IP block (all asics). * Returns 0 on success, error on failure. */ -int amdgpu_ring_lru_get(struct amdgpu_device *adev, int type, - struct amdgpu_ring **ring) +int amdgpu_ring_lru_get(struct amdgpu_device *adev, int type, int *blacklist, + int num_blacklist, struct amdgpu_ring **ring) { struct amdgpu_ring *entry; /* List is sorted in LRU order, find first entry corresponding * to the desired
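A worked example of the guarantee: if a context's mapper already holds kernel rings {0, 2} for user rings 0 and 1, those two ids populate ring_blacklist, so amdgpu_ring_lru_get() must hand user ring 2 a third, distinct kernel ring — no two user ring ids can alias the same kernel ring, which is the bijective mapping the title promises.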
[PATCH split 2/3] LRU map compute/SDMA user ring ids to kernel ring ids
Second part of the split of the series:
	Add support for high priority scheduling in amdgpu v8

These patches should be close to being good enough to land. The first two patches are simple fixes I've ported from the ROCm branch. These still need review.

I've fixed all of Christian's comments for patch 04:
	drm/amdgpu: implement lru amdgpu_queue_mgr policy for compute v4
[PATCH 1/6] drm/amdgpu: condense mqd programming sequence
The MQD structure matches the reg layout. Take advantage of this to simplify HQD programming. Note that the ACTIVE field still needs to be programmed last. Suggested-by: Felix Kuehling Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 44 +- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 84 +-- 2 files changed, 23 insertions(+), 105 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c index c0844a5..85321d6 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c @@ -3118,81 +3118,59 @@ static void gfx_v7_0_mqd_init(struct amdgpu_device *adev, mqd->cp_hqd_ib_rptr = RREG32(mmCP_HQD_IB_RPTR); mqd->cp_hqd_persistent_state = RREG32(mmCP_HQD_PERSISTENT_STATE); mqd->cp_hqd_sema_cmd = RREG32(mmCP_HQD_SEMA_CMD); mqd->cp_hqd_msg_type = RREG32(mmCP_HQD_MSG_TYPE); mqd->cp_hqd_atomic0_preop_lo = RREG32(mmCP_HQD_ATOMIC0_PREOP_LO); mqd->cp_hqd_atomic0_preop_hi = RREG32(mmCP_HQD_ATOMIC0_PREOP_HI); mqd->cp_hqd_atomic1_preop_lo = RREG32(mmCP_HQD_ATOMIC1_PREOP_LO); mqd->cp_hqd_atomic1_preop_hi = RREG32(mmCP_HQD_ATOMIC1_PREOP_HI); mqd->cp_hqd_pq_rptr = RREG32(mmCP_HQD_PQ_RPTR); mqd->cp_hqd_quantum = RREG32(mmCP_HQD_QUANTUM); mqd->cp_hqd_pipe_priority = RREG32(mmCP_HQD_PIPE_PRIORITY); mqd->cp_hqd_queue_priority = RREG32(mmCP_HQD_QUEUE_PRIORITY); mqd->cp_hqd_iq_rptr = RREG32(mmCP_HQD_IQ_RPTR); /* activate the queue */ mqd->cp_hqd_active = 1; } int gfx_v7_0_mqd_commit(struct amdgpu_device *adev, struct cik_mqd *mqd) { - u32 tmp; + uint32_t tmp; + uint32_t mqd_reg; + uint32_t *mqd_data; + + /* HQD registers extend from mmCP_MQD_BASE_ADDR to mmCP_MQD_CONTROL */ + mqd_data = &mqd->cp_mqd_base_addr_lo; /* disable wptr polling */ tmp = RREG32(mmCP_PQ_WPTR_POLL_CNTL); tmp = REG_SET_FIELD(tmp, CP_PQ_WPTR_POLL_CNTL, EN, 0); WREG32(mmCP_PQ_WPTR_POLL_CNTL, tmp); - /* program MQD field to HW */ - WREG32(mmCP_MQD_BASE_ADDR, mqd->cp_mqd_base_addr_lo); - WREG32(mmCP_MQD_BASE_ADDR_HI, mqd->cp_mqd_base_addr_hi); - WREG32(mmCP_MQD_CONTROL, mqd->cp_mqd_control); - WREG32(mmCP_HQD_PQ_BASE, mqd->cp_hqd_pq_base_lo); - WREG32(mmCP_HQD_PQ_BASE_HI, mqd->cp_hqd_pq_base_hi); - WREG32(mmCP_HQD_PQ_CONTROL, mqd->cp_hqd_pq_control); - WREG32(mmCP_HQD_PQ_WPTR_POLL_ADDR, mqd->cp_hqd_pq_wptr_poll_addr_lo); - WREG32(mmCP_HQD_PQ_WPTR_POLL_ADDR_HI, mqd->cp_hqd_pq_wptr_poll_addr_hi); - WREG32(mmCP_HQD_PQ_RPTR_REPORT_ADDR, mqd->cp_hqd_pq_rptr_report_addr_lo); - WREG32(mmCP_HQD_PQ_RPTR_REPORT_ADDR_HI, mqd->cp_hqd_pq_rptr_report_addr_hi); - WREG32(mmCP_HQD_PQ_DOORBELL_CONTROL, mqd->cp_hqd_pq_doorbell_control); - WREG32(mmCP_HQD_PQ_WPTR, mqd->cp_hqd_pq_wptr); - WREG32(mmCP_HQD_VMID, mqd->cp_hqd_vmid); - - WREG32(mmCP_HQD_IB_CONTROL, mqd->cp_hqd_ib_control); - WREG32(mmCP_HQD_IB_BASE_ADDR, mqd->cp_hqd_ib_base_addr_lo); - WREG32(mmCP_HQD_IB_BASE_ADDR_HI, mqd->cp_hqd_ib_base_addr_hi); - WREG32(mmCP_HQD_IB_RPTR, mqd->cp_hqd_ib_rptr); - WREG32(mmCP_HQD_PERSISTENT_STATE, mqd->cp_hqd_persistent_state); - WREG32(mmCP_HQD_SEMA_CMD, mqd->cp_hqd_sema_cmd); - WREG32(mmCP_HQD_MSG_TYPE, mqd->cp_hqd_msg_type); - WREG32(mmCP_HQD_ATOMIC0_PREOP_LO, mqd->cp_hqd_atomic0_preop_lo); - WREG32(mmCP_HQD_ATOMIC0_PREOP_HI, mqd->cp_hqd_atomic0_preop_hi); - WREG32(mmCP_HQD_ATOMIC1_PREOP_LO, mqd->cp_hqd_atomic1_preop_lo); - WREG32(mmCP_HQD_ATOMIC1_PREOP_HI, mqd->cp_hqd_atomic1_preop_hi); - WREG32(mmCP_HQD_PQ_RPTR, mqd->cp_hqd_pq_rptr); - WREG32(mmCP_HQD_QUANTUM, mqd->cp_hqd_quantum); - WREG32(mmCP_HQD_PIPE_PRIORITY, mqd->cp_hqd_pipe_priority); - 
WREG32(mmCP_HQD_QUEUE_PRIORITY, mqd->cp_hqd_queue_priority); - WREG32(mmCP_HQD_IQ_RPTR, mqd->cp_hqd_iq_rptr); + /* program all HQD registers */ + for (mqd_reg = mmCP_HQD_VMID; mqd_reg <= mmCP_MQD_CONTROL; mqd_reg++) + WREG32(mqd_reg, mqd_data[mqd_reg - mmCP_MQD_BASE_ADDR]); /* activate the HQD */ - WREG32(mmCP_HQD_ACTIVE, mqd->cp_hqd_active); + for (mqd_reg = mmCP_MQD_BASE_ADDR; mqd_reg <= mmCP_HQD_ACTIVE; mqd_reg++) + WREG32(mqd_reg, mqd_data[mqd_reg - mmCP_MQD_BASE_ADDR]); return 0; } static int gfx_v7_0_compute_queue_init(struct amdgpu_device *adev, int ring_id) { int r; u64 mqd_gpu_addr; struct cik_mqd *mqd; struct amdgpu_ring *ring = &adev->gfx.compute_ring[ring_id]; if (ring->mqd_obj == NULL) { r = amdgpu_bo_create(adev, sizeof(struct cik_mqd), PAGE_SIZE, true, AMDGPU_GEM_DOMAIN_GTT, 0,
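The whole simplification rests on the invariant the commit message states — the MQD struct mirrors the register file — so a register's dword lives at index (reg - mmCP_MQD_BASE_ADDR) in the struct. A minimal illustration, using names from the hunk above:

	uint32_t *mqd_data = &mqd->cp_mqd_base_addr_lo;	/* first mirrored register */

	/* these two writes are equivalent because of the 1:1 layout,
	 * which is what lets a single loop replace the long WREG32 list */
	WREG32(mmCP_HQD_PQ_BASE, mqd->cp_hqd_pq_base_lo);
	WREG32(mmCP_HQD_PQ_BASE, mqd_data[mmCP_HQD_PQ_BASE - mmCP_MQD_BASE_ADDR]);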
[PATCH 4/6] drm/amdgpu: implement lru amdgpu_queue_mgr policy for compute v4
Use an LRU policy to map usermode rings to HW compute queues. Most compute clients use one queue, and usually the first queue available. This results in poor pipe/queue work distribution when multiple compute apps are running. In most cases pipe 0 queue 0 is the only queue that gets used. In order to better distribute work across multiple HW queues, we adopt a policy to map the usermode ring ids to the LRU HW queue. This fixes a large majority of multi-app compute workloads sharing the same HW queue, even though 7 other queues are available. v2: use ring->funcs->type instead of ring->hw_ip v3: remove amdgpu_queue_mapper_funcs v4: change ring_lru_list_lock to spinlock, grab only once in lru_get() Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 3 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c | 38 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 63 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 4 ++ 5 files changed, 110 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 1d9053f..a9b7a61 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1617,40 +1617,43 @@ struct amdgpu_device { int num_ip_blocks; struct mutexmn_lock; DECLARE_HASHTABLE(mn_hash, 7); /* tracking pinned memory */ u64 vram_pin_size; u64 invisible_pin_size; u64 gart_pin_size; /* amdkfd interface */ struct kfd_dev *kfd; struct amdgpu_virt virt; /* link all shadow bo */ struct list_headshadow_list; struct mutexshadow_list_lock; /* link all gtt */ spinlock_t gtt_list_lock; struct list_headgtt_list; + /* keep an lru list of rings by HW IP */ + struct list_headring_lru_list; + spinlock_t ring_lru_list_lock; /* record hw reset is performed */ bool has_hw_reset; }; static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_bo_device *bdev) { return container_of(bdev, struct amdgpu_device, mman.bdev); } bool amdgpu_device_is_px(struct drm_device *dev); int amdgpu_device_init(struct amdgpu_device *adev, struct drm_device *ddev, struct pci_dev *pdev, uint32_t flags); void amdgpu_device_fini(struct amdgpu_device *adev); int amdgpu_gpu_wait_for_idle(struct amdgpu_device *adev); uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 724b4c1..2acceef 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -1865,40 +1865,43 @@ int amdgpu_device_init(struct amdgpu_device *adev, amdgpu_check_arguments(adev); /* Registers mapping */ /* TODO: block userspace mapping of io register */ spin_lock_init(&adev->mmio_idx_lock); spin_lock_init(&adev->smc_idx_lock); spin_lock_init(&adev->pcie_idx_lock); spin_lock_init(&adev->uvd_ctx_idx_lock); spin_lock_init(&adev->didt_idx_lock); spin_lock_init(&adev->gc_cac_idx_lock); spin_lock_init(&adev->audio_endpt_idx_lock); spin_lock_init(&adev->mm_stats.lock); INIT_LIST_HEAD(&adev->shadow_list); mutex_init(&adev->shadow_list_lock); INIT_LIST_HEAD(&adev->gtt_list); spin_lock_init(&adev->gtt_list_lock); + INIT_LIST_HEAD(&adev->ring_lru_list); + spin_lock_init(&adev->ring_lru_list_lock); + if (adev->asic_type >= CHIP_BONAIRE) { adev->rmmio_base = pci_resource_start(adev->pdev, 5); adev->rmmio_size = pci_resource_len(adev->pdev, 5); } else { adev->rmmio_base = pci_resource_start(adev->pdev, 2); adev->rmmio_size = pci_resource_len(adev->pdev, 2); } adev->rmmio = 
ioremap(adev->rmmio_base, adev->rmmio_size); if (adev->rmmio == NULL) { return -ENOMEM; } DRM_INFO("register mmio base: 0x%08X\n", (uint32_t)adev->rmmio_base); DRM_INFO("register mmio size: %u\n", (unsigned)adev->rmmio_size); if (adev->asic_type >= CHIP_BONAIRE) /* doorbell bar mapping */ amdgpu_doorbell_init(adev); /* io port mapping */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c index 3e9ac80..054d750 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c @@ -74,40 +74,74 @@ sta
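The policy itself is just list discipline — a sketch of the lookup, using the fields this patch adds (type matching per the v2 note):

	struct amdgpu_ring *entry;

	spin_lock(&adev->ring_lru_list_lock);
	list_for_each_entry(entry, &adev->ring_lru_list, lru_list) {
		if (entry->funcs->type == type) {
			/* front of the list == least recently used;
			 * using a ring moves it back to the tail */
			*ring = entry;
			list_move_tail(&entry->lru_list, &adev->ring_lru_list);
			break;
		}
	}
	spin_unlock(&adev->ring_lru_list_lock);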
[PATCH 3/6] drm/amdgpu: untie user ring ids from kernel ring ids v4
Add amdgpu_queue_mgr, a mechanism that allows disjointing usermode's ring ids from the kernel's ring ids. The queue manager maintains a per-file descriptor map of user ring ids to amdgpu_ring pointers. Once a map is created it is permanent (this is required to maintain FIFO execution guarantees for a context's ring). Different queue map policies can be configured for each HW IP. Currently all HW IPs use the identity mapper, i.e. kernel ring id is equal to the user ring id. The purpose of this mechanism is to distribute the load across multiple queues more effectively for HW IPs that support multiple rings. Userspace clients are unable to check whether a specific resource is in use by a different client. Therefore, it is up to the kernel driver to make the optimal choice. v2: remove amdgpu_queue_mapper_funcs v3: made amdgpu_queue_mgr per context instead of per-fd v4: add context_put on error paths Reviewed-by: Christian König Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/Makefile | 3 +- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 27 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c| 117 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 6 + drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c | 230 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 45 + drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 + 7 files changed, 335 insertions(+), 95 deletions(-) create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile index 660786a..dd48eb2 100644 --- a/drivers/gpu/drm/amd/amdgpu/Makefile +++ b/drivers/gpu/drm/amd/amdgpu/Makefile @@ -7,41 +7,42 @@ FULL_AMD_PATH=$(src)/.. ccflags-y := -Iinclude/drm -I$(FULL_AMD_PATH)/include/asic_reg \ -I$(FULL_AMD_PATH)/include \ -I$(FULL_AMD_PATH)/amdgpu \ -I$(FULL_AMD_PATH)/scheduler \ -I$(FULL_AMD_PATH)/powerplay/inc \ -I$(FULL_AMD_PATH)/acp/include amdgpu-y := amdgpu_drv.o # add KMS driver amdgpu-y += amdgpu_device.o amdgpu_kms.o \ amdgpu_atombios.o atombios_crtc.o amdgpu_connectors.o \ atom.o amdgpu_fence.o amdgpu_ttm.o amdgpu_object.o amdgpu_gart.o \ amdgpu_encoders.o amdgpu_display.o amdgpu_i2c.o \ amdgpu_fb.o amdgpu_gem.o amdgpu_ring.o \ amdgpu_cs.o amdgpu_bios.o amdgpu_benchmark.o amdgpu_test.o \ amdgpu_pm.o atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \ amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ - amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o amdgpu_atomfirmware.o + amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o amdgpu_atomfirmware.o \ + amdgpu_queue_mgr.o # add asic specific block amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \ ci_smc.o ci_dpm.o dce_v8_0.o gfx_v7_0.o cik_sdma.o uvd_v4_2.o vce_v2_0.o \ amdgpu_amdkfd_gfx_v7.o amdgpu-$(CONFIG_DRM_AMDGPU_SI)+= si.o gmc_v6_0.o gfx_v6_0.o si_ih.o si_dma.o dce_v6_0.o si_dpm.o si_smc.o amdgpu-y += \ vi.o mxgpu_vi.o nbio_v6_1.o soc15.o mxgpu_ai.o # add GMC block amdgpu-y += \ gmc_v7_0.o \ gmc_v8_0.o \ gfxhub_v1_0.o mmhub_v1_0.o gmc_v9_0.o # add IH block amdgpu-y += \ amdgpu_irq.o \ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 0a58575..1d9053f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -756,52 +756,76 @@ struct amdgpu_ib { uint32_tlength_dw; uint64_tgpu_addr; uint32_t*ptr; uint32_tflags; }; extern const struct amd_sched_backend_ops amdgpu_sched_ops; int amdgpu_job_alloc(struct 
amdgpu_device *adev, unsigned num_ibs, struct amdgpu_job **job, struct amdgpu_vm *vm); int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, unsigned size, struct amdgpu_job **job); void amdgpu_job_free_resources(struct amdgpu_job *job); void amdgpu_job_free(struct amdgpu_job *job); int amdgpu_job_submit(struct amdgpu_job *job, struct amdgpu_ring *ring, struct amd_sched_entity *entity, void *owner, struct dma_fence **f); /* + * Queue manager + */ +struct amdgpu_queue_mapper { + int hw_ip; + struct mutexlock; + /* protected by lock */ + struct amdgpu_ring *queue_map[AMDGPU_MAX_RINGS]; +}; + +struct amdgpu_queue_mgr { + struct amdgpu_queue_mapper mapper[AMDGPU_MAX_IP_NUM]; +}; + +int amdgpu_queue_mgr_init(struct amdgpu_device *adev, + struct amdgpu_queue_mgr *mgr); +int amdgpu_queue_mgr_fini(struct amdgpu_device *ad
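The diff above is truncated, but the fast path it implements is small — a sketch of the lookup in amdgpu_queue_mgr_map(), using the mapper fields and the amdgpu_identity_map() helper introduced by this patch:

	struct amdgpu_ring *ring;
	int r = 0;

	mutex_lock(&mapper->lock);
	ring = mapper->queue_map[user_ring];	/* cache hit: mapping is permanent */
	if (!ring)
		/* first use: run the per-IP policy (identity for all IPs here)
		 * and store the choice, preserving FIFO guarantees afterwards */
		r = amdgpu_identity_map(adev, mapper, user_ring, &ring);
	mutex_unlock(&mapper->lock);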
[PATCH 2/6] drm/amdgpu: workaround tonga HW bug in HQD programming sequence
Tonga-based ASICs may experience hangs when an HQD's EOP parameters are modified. Work around this HW issue by avoiding writes to these registers for Tonga ASICs.

Based on the following ROCm commit:
2a0fb8 - drm/amdgpu: Synchronize KFD HQD load protocol with CP scheduler

From the ROCm git repository:
https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver.git

CC: Jay Cornwall
Suggested-by: Felix Kuehling
Signed-off-by: Andres Rodriguez
---
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 0f1b62d..b9e0ded 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -5033,41 +5033,55 @@ static int gfx_v8_0_mqd_init(struct amdgpu_ring *ring)

 	/* activate the queue */
 	mqd->cp_hqd_active = 1;

 	return 0;
 }

 int gfx_v8_0_mqd_commit(struct amdgpu_device *adev, struct vi_mqd *mqd)
 {
 	uint32_t mqd_reg;
 	uint32_t *mqd_data;

 	/* HQD registers extend from mmCP_MQD_BASE_ADDR to mmCP_HQD_ERROR */
 	mqd_data = &mqd->cp_mqd_base_addr_lo;

 	/* disable wptr polling */
 	WREG32_FIELD(CP_PQ_WPTR_POLL_CNTL, EN, 0);

 	/* program all HQD registers */
-	for (mqd_reg = mmCP_HQD_VMID; mqd_reg <= mmCP_HQD_ERROR; mqd_reg++)
+	for (mqd_reg = mmCP_HQD_VMID; mqd_reg <= mmCP_HQD_EOP_CONTROL; mqd_reg++)
+		WREG32(mqd_reg, mqd_data[mqd_reg - mmCP_MQD_BASE_ADDR]);
+
+	/* Tonga errata: EOP RPTR/WPTR should be left unmodified.
+	 * This is safe since EOP RPTR==WPTR for any inactive HQD
+	 * on ASICs that do not support context-save.
+	 * EOP writes/reads can start anywhere in the ring.
+	 */
+	if (adev->asic_type != CHIP_TONGA) {
+		WREG32(mmCP_HQD_EOP_RPTR, mqd->cp_hqd_eop_rptr);
+		WREG32(mmCP_HQD_EOP_WPTR, mqd->cp_hqd_eop_wptr);
+		WREG32(mmCP_HQD_EOP_WPTR_MEM, mqd->cp_hqd_eop_wptr_mem);
+	}
+
+	for (mqd_reg = mmCP_HQD_EOP_EVENTS; mqd_reg <= mmCP_HQD_ERROR; mqd_reg++)
 		WREG32(mqd_reg, mqd_data[mqd_reg - mmCP_MQD_BASE_ADDR]);

 	/* activate the HQD */
 	for (mqd_reg = mmCP_MQD_BASE_ADDR; mqd_reg <= mmCP_HQD_ACTIVE; mqd_reg++)
 		WREG32(mqd_reg, mqd_data[mqd_reg - mmCP_MQD_BASE_ADDR]);

 	return 0;
 }

 static int gfx_v8_0_kiq_init_queue(struct amdgpu_ring *ring)
 {
 	int r = 0;
 	struct amdgpu_device *adev = ring->adev;
 	struct vi_mqd *mqd = ring->mqd_ptr;
 	int mqd_idx = AMDGPU_MAX_COMPUTE_RINGS;

 	gfx_v8_0_kiq_setting(ring);

 	if (adev->gfx.in_reset) {
 		/* for GPU_RESET case */
 		/* reset MQD to a clean status */
--
2.9.3
[PATCH 6/6] drm/amdgpu: use LRU mapping policy for SDMA engines
Spreading the load across multiple SDMA engines can increase memory transfer performance.

Signed-off-by: Andres Rodriguez
Reviewed-by: Nicolai Hähnle
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
index 5a7c691..e8984df 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
@@ -241,38 +241,38 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
 		return -EINVAL;
 	}

 	if (ring >= ip_num_rings) {
 		DRM_ERROR("Ring index:%d exceeds maximum:%d for ip:%d\n",
 			  ring, ip_num_rings, hw_ip);
 		return -EINVAL;
 	}

 	mutex_lock(&mapper->lock);

 	*out_ring = amdgpu_get_cached_map(mapper, ring);
 	if (*out_ring) {
 		/* cache hit */
 		r = 0;
 		goto out_unlock;
 	}

 	switch (mapper->hw_ip) {
 	case AMDGPU_HW_IP_GFX:
-	case AMDGPU_HW_IP_DMA:
 	case AMDGPU_HW_IP_UVD:
 	case AMDGPU_HW_IP_VCE:
 		r = amdgpu_identity_map(adev, mapper, ring, out_ring);
 		break;
+	case AMDGPU_HW_IP_DMA:
 	case AMDGPU_HW_IP_COMPUTE:
 		r = amdgpu_lru_map(adev, mapper, ring, out_ring);
 		break;
 	default:
 		*out_ring = NULL;
 		r = -EINVAL;
 		DRM_ERROR("unknown HW IP type: %d\n", mapper->hw_ip);
 	}

 out_unlock:
 	mutex_unlock(&mapper->lock);
 	return r;
 }
--
2.9.3
Re: [PATCH split] Improve pipe split between amdgpu and amdkfd
Forgot to mention:

* Re-ordered some patches as suggested by Felix
* Included "drm: Fix get_property logic fumble" during testing, otherwise the system boots to a black screen.

Regards,
Andres

On 2017-04-13 05:35 PM, Andres Rodriguez wrote:
> This is a split of patches that are ready to land from the series:
> 	Add support for high priority scheduling in amdgpu v8
>
> I've included Felix and Alex's feedback from the thread above. This includes:
> * Separate MEC_HPD_SIZE rename into a separate patch (patch 01)
> * Added a patch to fix the kgd_hqd_load bug Felix pointed out (patch 06)
> * Fixes for various off-by-one errors
> * Use gfx_v8_0_deactivate_hqd
>
> The only comment I didn't address was changing the queue allocation policy for gfx9 (similar to gfx7/8). See the inline reply in that thread for more details on why this was skipped.
[PATCH 10/17] drm/amdgpu: allow split of queues with kfd at queue granularity v4
Previously the queue/pipe split with kfd operated with pipe granularity. This patch allows amdgpu to take ownership of an arbitrary set of queues. It also consolidates the last few magic numbers in the compute initialization process into mec_init. v2: support for gfx9 v3: renamed AMDGPU_MAX_QUEUES to AMDGPU_MAX_COMPUTE_QUEUES v4: fix off-by-one in num_mec checks in *_compute_queue_acquire Reviewed-by: Edward O'Callaghan Reviewed-by: Felix Kuehling Acked-by: Christian König Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 7 +++ drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 82 +--- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 81 +++- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 84 +++-- drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 1 + 5 files changed, 211 insertions(+), 44 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 6b294d2..61990be 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -29,40 +29,42 @@ #define __AMDGPU_H__ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include + #include "amd_shared.h" #include "amdgpu_mode.h" #include "amdgpu_ih.h" #include "amdgpu_irq.h" #include "amdgpu_ucode.h" #include "amdgpu_ttm.h" #include "amdgpu_psp.h" #include "amdgpu_gds.h" #include "amdgpu_sync.h" #include "amdgpu_ring.h" #include "amdgpu_vm.h" #include "amd_powerplay.h" #include "amdgpu_dpm.h" #include "amdgpu_acp.h" #include "amdgpu_uvd.h" #include "amdgpu_vce.h" #include "gpu_scheduler.h" #include "amdgpu_virt.h" @@ -875,49 +877,54 @@ struct amdgpu_rlc { /* safe mode for updating CG/PG state */ bool in_safe_mode; const struct amdgpu_rlc_funcs *funcs; /* for firmware data */ u32 save_and_restore_offset; u32 clear_state_descriptor_offset; u32 avail_scratch_ram_locations; u32 reg_restore_list_size; u32 reg_list_format_start; u32 reg_list_format_separate_start; u32 starting_offsets_start; u32 reg_list_format_size_bytes; u32 reg_list_size_bytes; u32 *register_list_format; u32 *register_restore; }; +#define AMDGPU_MAX_COMPUTE_QUEUES KGD_MAX_QUEUES + struct amdgpu_mec { struct amdgpu_bo*hpd_eop_obj; u64 hpd_eop_gpu_addr; struct amdgpu_bo*mec_fw_obj; u64 mec_fw_gpu_addr; u32 num_mec; u32 num_pipe_per_mec; u32 num_queue_per_pipe; void*mqd_backup[AMDGPU_MAX_COMPUTE_RINGS + 1]; + + /* These are the resources for which amdgpu takes ownership */ + DECLARE_BITMAP(queue_bitmap, AMDGPU_MAX_COMPUTE_QUEUES); }; struct amdgpu_kiq { u64 eop_gpu_addr; struct amdgpu_bo*eop_obj; struct amdgpu_ring ring; struct amdgpu_irq_src irq; }; /* * GPU scratch registers structures, functions & helpers */ struct amdgpu_scratch { unsignednum_reg; uint32_treg_base; uint32_tfree_mask; }; /* * GFX configurations diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c index 41bda98..8520b4b 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c @@ -32,41 +32,40 @@ #include "amdgpu_ucode.h" #include "clearstate_ci.h" #include "dce/dce_8_0_d.h" #include "dce/dce_8_0_sh_mask.h" #include "bif/bif_4_1_d.h" #include "bif/bif_4_1_sh_mask.h" #include "gca/gfx_7_0_d.h" #include "gca/gfx_7_2_enum.h" #include "gca/gfx_7_2_sh_mask.h" #include "gmc/gmc_7_0_d.h" #include "gmc/gmc_7_0_sh_mask.h" #include "oss/oss_2_0_d.h" #include "oss/oss_2_0_sh_mask.h" #define GFX7_NUM_GFX_RINGS 1 -#define GFX7_NUM_COMPUTE_RINGS 8 #define GFX7_MEC_HPD_SIZE 2048 static void 
gfx_v7_0_set_ring_funcs(struct amdgpu_device *adev); static void gfx_v7_0_set_irq_funcs(struct amdgpu_device *adev); static void gfx_v7_0_set_gds_init(struct amdgpu_device *adev); MODULE_FIRMWARE("radeon/bonaire_pfp.bin"); MODULE_FIRMWARE("radeon/bonaire_me.bin"); MODULE_FIRMWARE("radeon/bonaire_ce.bin"); MODULE_FIRMWARE("radeon/bonaire_rlc.bin"); MODULE_FIRMWARE("radeon/bonaire_mec.bin"); MODULE_FIRMWARE("radeon/hawaii_pfp.bin"); MODULE_FIRMWARE("radeon/hawaii_me.bin"); MODULE_FIRMWARE("radeon/hawaii_ce.bin"); MODULE_FIRMWARE("radeon/hawaii_rlc.bin"); MODULE_FIRMWARE("radeon/hawaii_mec.bin"); MODULE_FIRMWARE("radeon/kaveri_pfp.bin"); MODULE_FIRMWARE("radeon/kaveri_me.bin"); @@ -2806,67 +2805,98 @@ static void gfx_v7_0_cp_compute_fini(struct amdgpu_device *adev)
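The *_compute_queue_acquire helpers themselves are cut off above; the shape of the flat-index decomposition they rely on would be roughly the following — a sketch only, with the ownership test shown as a placeholder rather than the policy this patch actually applies:

	int i, queue, pipe, mec;

	for (i = 0; i < AMDGPU_MAX_COMPUTE_QUEUES; ++i) {
		/* decompose the flat bit index into (mec, pipe, queue) */
		queue = i % adev->gfx.mec.num_queue_per_pipe;
		pipe = (i / adev->gfx.mec.num_queue_per_pipe)
			% adev->gfx.mec.num_pipe_per_mec;
		mec = (i / adev->gfx.mec.num_queue_per_pipe)
			/ adev->gfx.mec.num_pipe_per_mec;

		if (mec >= adev->gfx.mec.num_mec)	/* ran out of HW */
			break;

		if (mec == 0 && pipe == 0)		/* placeholder policy */
			set_bit(i, adev->gfx.mec.queue_bitmap);
	}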
[PATCH 15/17] drm/amdgpu: remove hardcoded queue_mask in PACKET3_SET_RESOURCES
The assumption that we are only using the first pipe no longer holds. Instead, calculate the queue_mask from the queue_bitmap. Acked-by: Felix Kuehling Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 20 ++-- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 23 +-- 2 files changed, 39 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c index 90e1dd3..ff77351 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c @@ -4697,60 +4697,76 @@ static int gfx_v8_0_cp_compute_load_microcode(struct amdgpu_device *adev) /* KIQ functions */ static void gfx_v8_0_kiq_setting(struct amdgpu_ring *ring) { uint32_t tmp; struct amdgpu_device *adev = ring->adev; /* tell RLC which is KIQ queue */ tmp = RREG32(mmRLC_CP_SCHEDULERS); tmp &= 0xff00; tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue); WREG32(mmRLC_CP_SCHEDULERS, tmp); tmp |= 0x80; WREG32(mmRLC_CP_SCHEDULERS, tmp); } static int gfx_v8_0_kiq_kcq_enable(struct amdgpu_device *adev) { struct amdgpu_ring *kiq_ring = &adev->gfx.kiq.ring; uint32_t scratch, tmp = 0; + uint64_t queue_mask = 0; int r, i; + for (i = 0; i < AMDGPU_MAX_COMPUTE_QUEUES; ++i) { + if (!test_bit(i, adev->gfx.mec.queue_bitmap)) + continue; + + /* This situation may be hit in the future if a new HW +* generation exposes more than 64 queues. If so, the +* definition of queue_mask needs updating */ + if (WARN_ON(i > (sizeof(queue_mask)*8))) { + DRM_ERROR("Invalid KCQ enabled: %d\n", i); + break; + } + + queue_mask |= (1ull << i); + } + r = amdgpu_gfx_scratch_get(adev, &scratch); if (r) { DRM_ERROR("Failed to get scratch reg (%d).\n", r); return r; } WREG32(scratch, 0xCAFEDEAD); r = amdgpu_ring_alloc(kiq_ring, (8 * adev->gfx.num_compute_rings) + 11); if (r) { DRM_ERROR("Failed to lock KIQ (%d).\n", r); amdgpu_gfx_scratch_free(adev, scratch); return r; } /* set resources */ amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_SET_RESOURCES, 6)); amdgpu_ring_write(kiq_ring, 0); /* vmid_mask:0 queue_type:0 (KIQ) */ - amdgpu_ring_write(kiq_ring, 0x00FF);/* queue mask lo */ - amdgpu_ring_write(kiq_ring, 0); /* queue mask hi */ + amdgpu_ring_write(kiq_ring, lower_32_bits(queue_mask)); /* queue mask lo */ + amdgpu_ring_write(kiq_ring, upper_32_bits(queue_mask)); /* queue mask hi */ amdgpu_ring_write(kiq_ring, 0); /* gws mask lo */ amdgpu_ring_write(kiq_ring, 0); /* gws mask hi */ amdgpu_ring_write(kiq_ring, 0); /* oac mask */ amdgpu_ring_write(kiq_ring, 0); /* gds heap base:0, gds heap size:0 */ for (i = 0; i < adev->gfx.num_compute_rings; i++) { struct amdgpu_ring *ring = &adev->gfx.compute_ring[i]; uint64_t mqd_addr = amdgpu_bo_gpu_offset(ring->mqd_obj); uint64_t wptr_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4); /* map queues */ amdgpu_ring_write(kiq_ring, PACKET3(PACKET3_MAP_QUEUES, 5)); /* Q_sel:0, vmid:0, vidmem: 1, engine:0, num_Q:1*/ amdgpu_ring_write(kiq_ring, PACKET3_MAP_QUEUES_NUM_QUEUES(1)); amdgpu_ring_write(kiq_ring, PACKET3_MAP_QUEUES_DOORBELL_OFFSET(ring->doorbell_index) | PACKET3_MAP_QUEUES_QUEUE(ring->queue) | PACKET3_MAP_QUEUES_PIPE(ring->pipe) | PACKET3_MAP_QUEUES_ME(ring->me == 1 ? 
0 : 1)); /* doorbell */ amdgpu_ring_write(kiq_ring, lower_32_bits(mqd_addr)); diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index 6208493..5a5ff47 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c @@ -1895,46 +1895,65 @@ static int gfx_v9_0_cp_compute_resume(struct amdgpu_device *adev) return 0; } /* KIQ functions */ static void gfx_v9_0_kiq_setting(struct amdgpu_ring *ring) { uint32_t tmp; struct amdgpu_device *adev = ring->adev; /* tell RLC which is KIQ queue */ tmp = RREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS); tmp &= 0xff00; tmp |= (ring->me << 5) | (ring->pipe << 3) | (ring->queue); WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS, tmp); tmp |= 0x80; WREG32_SOC15(GC, 0, mmRLC_CP_SCHEDULERS, tmp); } static void gfx_v9_0_kiq_en
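For a concrete reading of the new mask computation: if amdgpu owns the first eight flat queue ids (queue_bitmap bits 0-7), the loop yields queue_mask == 0x00FF — exactly the value the old code hardcoded for "queue mask lo" — and any other ownership split now produces the matching mask automatically.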
[PATCH 12/17] drm/amdkfd: allow split HQD on per-queue granularity v4
Update the KGD to KFD interface to allow sharing pipes with queue granularity instead of pipe granularity. This allows for more interesting pipe/queue splits. v2: fix overflow check for res.queue_mask v3: fix shift overflow when setting res.queue_mask v4: fix comment in is_pipeline_enabled() Reviewed-by: Edward O'Callaghan Reviewed-by: Felix Kuehling Acked-by: Christian König Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 22 - drivers/gpu/drm/amd/amdkfd/kfd_device.c| 4 + .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 100 ++--- .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h | 10 +-- .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c| 3 +- .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 2 +- drivers/gpu/drm/amd/include/kgd_kfd_interface.h| 17 ++-- drivers/gpu/drm/radeon/radeon_kfd.c| 21 - 9 files changed, 126 insertions(+), 55 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index 3200ff9..8fc5aa3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -78,48 +78,64 @@ bool amdgpu_amdkfd_load_interface(struct amdgpu_device *adev) return true; } void amdgpu_amdkfd_fini(void) { if (kgd2kfd) { kgd2kfd->exit(); symbol_put(kgd2kfd_init); } } void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev) { if (kgd2kfd) adev->kfd = kgd2kfd->probe((struct kgd_dev *)adev, adev->pdev, kfd2kgd); } void amdgpu_amdkfd_device_init(struct amdgpu_device *adev) { + int i; + int last_valid_bit; if (adev->kfd) { struct kgd2kfd_shared_resources gpu_resources = { .compute_vmid_bitmap = 0xFF00, - - .first_compute_pipe = 1, - .compute_pipe_count = 4 - 1, + .num_mec = adev->gfx.mec.num_mec, + .num_pipe_per_mec = adev->gfx.mec.num_pipe_per_mec, + .num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe }; + /* this is going to have a few of the MSBs set that we need to +* clear */ + bitmap_complement(gpu_resources.queue_bitmap, + adev->gfx.mec.queue_bitmap, + KGD_MAX_QUEUES); + + /* According to linux/bitmap.h we shouldn't use bitmap_clear if +* nbits is not compile time constant */ + last_valid_bit = adev->gfx.mec.num_mec + * adev->gfx.mec.num_pipe_per_mec + * adev->gfx.mec.num_queue_per_pipe; + for (i = last_valid_bit; i < KGD_MAX_QUEUES; ++i) + clear_bit(i, gpu_resources.queue_bitmap); + amdgpu_doorbell_get_kfd_info(adev, &gpu_resources.doorbell_physical_address, &gpu_resources.doorbell_aperture_size, &gpu_resources.doorbell_start_offset); kgd2kfd->device_init(adev->kfd, &gpu_resources); } } void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev) { if (adev->kfd) { kgd2kfd->device_exit(adev->kfd); adev->kfd = NULL; } } void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev, const void *ih_ring_entry) { diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index 3f95f7c..88187bf 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c @@ -209,40 +209,44 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid, pasid, address, flags); dev = kfd_device_by_pci_dev(pdev); BUG_ON(dev == NULL); kfd_signal_iommu_event(dev, pasid, address, flags & PPR_FAULT_WRITE, flags & PPR_FAULT_EXEC); return AMD_IOMMU_INV_PRI_RSP_INVALID; } bool kgd2kfd_device_init(struct kfd_dev *kfd, const struct kgd2kfd_shared_resources *gpu_resources) { unsigned int size; kfd->shared_resources = *gpu_resources; + /* We only use the first MEC 
*/ + if (kfd->shared_resources.num_mec > 1) + kfd->shared_resources.num_mec = 1; + /* calculate max size of mqds needed for queues */ size = max_num_of_queues_per_device * kfd->device_info->mqd_size_aligned; /* * calculate max size of runlist packet. * There can be only 2 packets
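Concretely, the split works out as: amdgpu's queue_bitmap is complemented so KFD is offered every queue amdgpu does not own (e.g. if amdgpu owns flat queue ids 0-7, KFD is offered ids 8 through last_valid_bit - 1), and the bits beyond mec * pipes * queues are then cleared so KFD never sees queues the hardware doesn't actually have.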
[PATCH 05/17] drm/amdgpu: unify MQD programming sequence for kfd and amdgpu v2
Use the same gfx_*_mqd_commit function for kfd and amdgpu codepaths. This removes the last duplicates of this programming sequence. v2: fix cp_hqd_pq_wptr value Reviewed-by: Edward O'Callaghan Acked-by: Christian König Reviewed-by: Felix Kuehling Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 51 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 49 ++ drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 38 - drivers/gpu/drm/amd/amdgpu/gfx_v7_0.h | 5 +++ drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 49 +++--- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.h | 5 +++ 6 files changed, 97 insertions(+), 100 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c index 1a0a5f7..038b7ea 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c @@ -12,40 +12,41 @@ * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * OTHER DEALINGS IN THE SOFTWARE. */ #include #include #include #include #include "amdgpu.h" #include "amdgpu_amdkfd.h" #include "cikd.h" #include "cik_sdma.h" #include "amdgpu_ucode.h" +#include "gfx_v7_0.h" #include "gca/gfx_7_2_d.h" #include "gca/gfx_7_2_enum.h" #include "gca/gfx_7_2_sh_mask.h" #include "oss/oss_2_0_d.h" #include "oss/oss_2_0_sh_mask.h" #include "gmc/gmc_7_1_d.h" #include "gmc/gmc_7_1_sh_mask.h" #include "cik_structs.h" #define CIK_PIPE_PER_MEC (4) enum { MAX_TRAPID = 8, /* 3 bits in the bitfield. 
*/ MAX_WATCH_ADDRESSES = 4 }; enum { ADDRESS_WATCH_REG_ADDR_HI = 0, ADDRESS_WATCH_REG_ADDR_LO, ADDRESS_WATCH_REG_CNTL, @@ -292,89 +293,45 @@ static inline uint32_t get_sdma_base_addr(struct cik_sdma_rlc_registers *m) static inline struct cik_mqd *get_mqd(void *mqd) { return (struct cik_mqd *)mqd; } static inline struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd) { return (struct cik_sdma_rlc_registers *)mqd; } static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id, uint32_t queue_id, uint32_t __user *wptr) { struct amdgpu_device *adev = get_amdgpu_device(kgd); uint32_t wptr_shadow, is_wptr_shadow_valid; struct cik_mqd *m; m = get_mqd(mqd); is_wptr_shadow_valid = !get_user(wptr_shadow, wptr); - - acquire_queue(kgd, pipe_id, queue_id); - WREG32(mmCP_MQD_BASE_ADDR, m->cp_mqd_base_addr_lo); - WREG32(mmCP_MQD_BASE_ADDR_HI, m->cp_mqd_base_addr_hi); - WREG32(mmCP_MQD_CONTROL, m->cp_mqd_control); - - WREG32(mmCP_HQD_PQ_BASE, m->cp_hqd_pq_base_lo); - WREG32(mmCP_HQD_PQ_BASE_HI, m->cp_hqd_pq_base_hi); - WREG32(mmCP_HQD_PQ_CONTROL, m->cp_hqd_pq_control); - - WREG32(mmCP_HQD_IB_CONTROL, m->cp_hqd_ib_control); - WREG32(mmCP_HQD_IB_BASE_ADDR, m->cp_hqd_ib_base_addr_lo); - WREG32(mmCP_HQD_IB_BASE_ADDR_HI, m->cp_hqd_ib_base_addr_hi); - - WREG32(mmCP_HQD_IB_RPTR, m->cp_hqd_ib_rptr); - - WREG32(mmCP_HQD_PERSISTENT_STATE, m->cp_hqd_persistent_state); - WREG32(mmCP_HQD_SEMA_CMD, m->cp_hqd_sema_cmd); - WREG32(mmCP_HQD_MSG_TYPE, m->cp_hqd_msg_type); - - WREG32(mmCP_HQD_ATOMIC0_PREOP_LO, m->cp_hqd_atomic0_preop_lo); - WREG32(mmCP_HQD_ATOMIC0_PREOP_HI, m->cp_hqd_atomic0_preop_hi); - WREG32(mmCP_HQD_ATOMIC1_PREOP_LO, m->cp_hqd_atomic1_preop_lo); - WREG32(mmCP_HQD_ATOMIC1_PREOP_HI, m->cp_hqd_atomic1_preop_hi); - - WREG32(mmCP_HQD_PQ_RPTR_REPORT_ADDR, m->cp_hqd_pq_rptr_report_addr_lo); - WREG32(mmCP_HQD_PQ_RPTR_REPORT_ADDR_HI, - m->cp_hqd_pq_rptr_report_addr_hi); - - WREG32(mmCP_HQD_PQ_RPTR, m->cp_hqd_pq_rptr); - - WREG32(mmCP_HQD_PQ_WPTR_POLL_ADDR, m->cp_hqd_pq_wptr_poll_addr_lo); - WREG32(mmCP_HQD_PQ_WPTR_POLL_ADDR_HI, m->cp_hqd_pq_wptr_poll_addr_hi); - - WREG32(mmCP_HQD_PQ_DOORBELL_CONTROL, m->cp_hqd_pq_doorbell_control); - - WREG32(mmCP_HQD_VMID, m->cp_hqd_vmid); - - WREG32(mmCP_HQD_QUANTUM, m->cp_hqd_quantum); - - WREG32(mmCP_HQD_PIPE_PRIORITY, m->cp_hqd_pipe_priority); - WREG32(mmCP_HQD_QUEUE_PRIORITY, m->cp_hqd_queue_priority); - - WREG32(mmCP_HQD_IQ_RPTR, m->cp_hqd_iq_rptr); - if (is_wptr_shadow_valid) - WREG32(mmCP_HQD_PQ_WPTR, wptr_shadow); +
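For reference, the shape of the deduplication is simple: the KFD load path patches the user-supplied wptr into the MQD image and then calls the same commit routine amdgpu itself uses. A toy, compilable sketch of that structure (all names are stand-ins for the real gfx_*_mqd_commit plumbing, and printf stands in for the register writes):

#include <stdint.h>
#include <stdio.h>

/* stand-in for the MQD snapshot; the real struct has many more fields */
struct mqd { uint32_t base, pq_control, pq_wptr, active; };

/* the single shared programming sequence */
static void mqd_commit(const struct mqd *m)
{
    printf("WREG32(BASE, %u)\n", m->base);
    printf("WREG32(PQ_CONTROL, %u)\n", m->pq_control);
    printf("WREG32(PQ_WPTR, %u)\n", m->pq_wptr);
    printf("WREG32(ACTIVE, %u)\n", m->active);
}

/* KFD path: patch the user wptr into the MQD, then delegate */
static void kgd_hqd_load_sketch(struct mqd *m, uint32_t user_wptr, int wptr_ok)
{
    if (wptr_ok)
        m->pq_wptr = user_wptr;   /* the v2 fix: go through the MQD image */
    mqd_commit(m);
}

int main(void)
{
    struct mqd m = { .base = 1, .pq_control = 2, .pq_wptr = 0, .active = 1 };
    kgd_hqd_load_sketch(&m, 42, 1);  /* amdgpu's own path calls mqd_commit() directly */
    return 0;
}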
[PATCH 17/17] drm/amdgpu: new queue policy, take first 2 queues of each pipe v2
Instead of taking the first pipe and giving the rest to kfd, take the first 2 queues of each pipe. Effectively, amdgpu and amdkfd own the same number of queues. But because the queues are spread over multiple pipes, the hardware will be able to better handle concurrent compute workloads. amdgpu goes from 1 pipe to 4 pipes, i.e. from 1 compute thread to 4. amdkfd goes from 3 pipes to 4 pipes, i.e. from 3 compute threads to 4. v2: fix policy comment Reviewed-by: Edward O'Callaghan Reviewed-by: Felix Kuehling Acked-by: Christian König Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c index 684f053..c0844a5 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c @@ -2821,42 +2821,42 @@ static void gfx_v7_0_mec_fini(struct amdgpu_device *adev) adev->gfx.mec.hpd_eop_obj = NULL; } } static void gfx_v7_0_compute_queue_acquire(struct amdgpu_device *adev) { int i, queue, pipe, mec; /* policy for amdgpu compute queue ownership */ for (i = 0; i < AMDGPU_MAX_COMPUTE_QUEUES; ++i) { queue = i % adev->gfx.mec.num_queue_per_pipe; pipe = (i / adev->gfx.mec.num_queue_per_pipe) % adev->gfx.mec.num_pipe_per_mec; mec = (i / adev->gfx.mec.num_queue_per_pipe) / adev->gfx.mec.num_pipe_per_mec; /* we've run out of HW */ if (mec >= adev->gfx.mec.num_mec) break; - /* policy: amdgpu owns all queues in the first pipe */ - if (mec == 0 && pipe == 0) + /* policy: amdgpu owns the first two queues of the first MEC */ + if (mec == 0 && queue < 2) set_bit(i, adev->gfx.mec.queue_bitmap); } /* update the number of active compute rings */ adev->gfx.num_compute_rings = bitmap_weight(adev->gfx.mec.queue_bitmap, AMDGPU_MAX_COMPUTE_QUEUES); /* If you hit this case and edited the policy, you probably just * need to increase AMDGPU_MAX_COMPUTE_RINGS */ if (WARN_ON(adev->gfx.num_compute_rings > AMDGPU_MAX_COMPUTE_RINGS)) adev->gfx.num_compute_rings = AMDGPU_MAX_COMPUTE_RINGS; } static int gfx_v7_0_mec_init(struct amdgpu_device *adev) { int r; u32 *hpd; size_t mec_hpd_size; bitmap_zero(adev->gfx.mec.queue_bitmap, AMDGPU_MAX_COMPUTE_QUEUES); diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c index 2178611..a5ba48b 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c @@ -1439,42 +1439,42 @@ static void gfx_v8_0_kiq_free_ring(struct amdgpu_ring *ring, amdgpu_wb_free(ring->adev, ring->adev->virt.reg_val_offs); amdgpu_ring_fini(ring); } static void gfx_v8_0_compute_queue_acquire(struct amdgpu_device *adev) { int i, queue, pipe, mec; /* policy for amdgpu compute queue ownership */ for (i = 0; i < AMDGPU_MAX_COMPUTE_QUEUES; ++i) { queue = i % adev->gfx.mec.num_queue_per_pipe; pipe = (i / adev->gfx.mec.num_queue_per_pipe) % adev->gfx.mec.num_pipe_per_mec; mec = (i / adev->gfx.mec.num_queue_per_pipe) / adev->gfx.mec.num_pipe_per_mec; /* we've run out of HW */ if (mec >= adev->gfx.mec.num_mec) break; - /* policy: amdgpu owns all queues in the first pipe */ - if (mec == 0 && pipe == 0) + /* policy: amdgpu owns the first two queues of the first MEC */ + if (mec == 0 && queue < 2) set_bit(i, adev->gfx.mec.queue_bitmap); } /* update the number of active compute rings */ adev->gfx.num_compute_rings = bitmap_weight(adev->gfx.mec.queue_bitmap, AMDGPU_MAX_COMPUTE_QUEUES); /* If you hit this case and edited the policy, you 
probably just * need to increase AMDGPU_MAX_COMPUTE_RINGS */ if (WARN_ON(adev->gfx.num_compute_rings > AMDGPU_MAX_COMPUTE_RINGS)) adev->gfx.num_compute_rings = AMDGPU_MAX_COMPUTE_RINGS; } static int gfx_v8_0_mec_init(struct amdgpu_device *adev) { int r; u32 *hpd; size_t mec_hpd_size; bitmap_zero(adev->gfx.mec.queue_bitmap, AMDGPU_MAX_COMPUTE_QUEUES); -- 2.9.3 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
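To make the "same number of queues, spread across pipes" claim concrete, here is a small compilable sketch comparing the old policy (all queues of pipe 0) against the new one (first two queues of every pipe on MEC 0). The 4 pipes x 8 queues topology is typical for these parts but is hard-coded here purely for illustration:

#include <stdio.h>

int main(void)
{
    int num_pipe = 4, num_queue = 8;
    int pipe, queue, old_cnt = 0, new_cnt = 0, new_pipes = 0;

    for (pipe = 0; pipe < num_pipe; pipe++) {
        int pipe_used = 0;
        for (queue = 0; queue < num_queue; queue++) {
            if (pipe == 0)          /* old: amdgpu owned all of pipe 0 */
                old_cnt++;
            if (queue < 2) {        /* new: first 2 queues of each pipe */
                new_cnt++;
                pipe_used = 1;
            }
        }
        new_pipes += pipe_used;
    }
    /* both policies give amdgpu 8 queues, but the new one spreads
     * them over 4 pipes instead of concentrating them on 1 */
    printf("old: %d queues on 1 pipe, new: %d queues on %d pipes\n",
           old_cnt, new_cnt, new_pipes);
    return 0;
}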
[PATCH 14/17] drm/amdgpu: allocate queues horizontally across pipes
Pipes provide better concurrency than queues, therefore we want to make sure that apps use queues from different pipes whenever possible. Optimize for the trivial case where an app will consume rings in order, therefore we don't want adjacent rings to belong to the same pipe. Reviewed-by: Edward O'Callaghan Acked-by: Felix Kuehling Acked-by: Christian König Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 13 ++ drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 83 +++-- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 86 +-- 3 files changed, 113 insertions(+), 69 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 61990be..0583396 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1762,40 +1762,53 @@ static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring, void *sr ring->count_dw -= count_dw; } } static inline struct amdgpu_sdma_instance * amdgpu_get_sdma_instance(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; int i; for (i = 0; i < adev->sdma.num_instances; i++) if (&adev->sdma.instance[i].ring == ring) break; if (i < AMDGPU_MAX_SDMA_INSTANCES) return &adev->sdma.instance[i]; else return NULL; } +static inline bool amdgpu_is_mec_queue_enabled(struct amdgpu_device *adev, + int mec, int pipe, int queue) +{ + int bit = 0; + + bit += mec * adev->gfx.mec.num_pipe_per_mec + * adev->gfx.mec.num_queue_per_pipe; + bit += pipe * adev->gfx.mec.num_queue_per_pipe; + bit += queue; + + return test_bit(bit, adev->gfx.mec.queue_bitmap); +} + /* * ASICs macro. */ #define amdgpu_asic_set_vga_state(adev, state) (adev)->asic_funcs->set_vga_state((adev), (state)) #define amdgpu_asic_reset(adev) (adev)->asic_funcs->reset((adev)) #define amdgpu_asic_get_xclk(adev) (adev)->asic_funcs->get_xclk((adev)) #define amdgpu_asic_set_uvd_clocks(adev, v, d) (adev)->asic_funcs->set_uvd_clocks((adev), (v), (d)) #define amdgpu_asic_set_vce_clocks(adev, ev, ec) (adev)->asic_funcs->set_vce_clocks((adev), (ev), (ec)) #define amdgpu_get_pcie_lanes(adev) (adev)->asic_funcs->get_pcie_lanes((adev)) #define amdgpu_set_pcie_lanes(adev, l) (adev)->asic_funcs->set_pcie_lanes((adev), (l)) #define amdgpu_asic_get_gpu_clock_counter(adev) (adev)->asic_funcs->get_gpu_clock_counter((adev)) #define amdgpu_asic_read_disabled_bios(adev) (adev)->asic_funcs->read_disabled_bios((adev)) #define amdgpu_asic_read_bios_from_rom(adev, b, l) (adev)->asic_funcs->read_bios_from_rom((adev), (b), (l)) #define amdgpu_asic_read_register(adev, se, sh, offset, v)((adev)->asic_funcs->read_register((adev), (se), (sh), (offset), (v))) #define amdgpu_asic_get_config_memsize(adev) (adev)->asic_funcs->get_config_memsize((adev)) #define amdgpu_gart_flush_gpu_tlb(adev, vmid) (adev)->gart.gart_funcs->flush_gpu_tlb((adev), (vmid)) #define amdgpu_gart_set_pte_pde(adev, pt, idx, addr, flags) (adev)->gart.gart_funcs->set_pte_pde((adev), (pt), (idx), (addr), (flags)) #define amdgpu_vm_copy_pte(adev, ib, pe, src, count) ((adev)->vm_manager.vm_pte_funcs->copy_pte((ib), (pe), (src), (count))) #define amdgpu_vm_write_pte(adev, ib, pe, value, count, incr) ((adev)->vm_manager.vm_pte_funcs->write_pte((ib), (pe), (value), (count), (incr))) #define amdgpu_vm_set_pte_pde(adev, ib, pe, addr, count, incr, flags) ((adev)->vm_manager.vm_pte_funcs->set_pte_pde((ib), (pe), (addr), (count), (incr), (flags))) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c index 8969c69..684f053 100644 --- 
a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c @@ -4733,45 +4733,76 @@ static void gfx_v7_0_gpu_early_init(struct amdgpu_device *adev) adev->gfx.config.num_gpus = 1; adev->gfx.config.multi_gpu_tile_size = 64; /* fix up row size */ gb_addr_config &= ~GB_ADDR_CONFIG__ROW_SIZE_MASK; switch (adev->gfx.config.mem_row_size_in_kb) { case 1: default: gb_addr_config |= (0 << GB_ADDR_CONFIG__ROW_SIZE__SHIFT); break; case 2: gb_addr_config |= (1 << GB_ADDR_CONFIG__ROW_SIZE__SHIFT); break; case 4: gb_addr_config |= (2 << GB_ADDR_CONFIG__ROW_SIZE__SHIFT); break; } adev->gfx.config.gb_addr_config = gb_addr_config; } +static int gfx_v7_0_compute_ring_init(struct amdgpu_device *adev, int ring_id, + int mec, int pipe, int queue) +{ + int r; + unsigned irq_type; + struct amdgpu_ring *ring = &adev->gfx.compute_ring[ring_id]; + +
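One way to realize the horizontal policy is to make the pipe index the fastest-moving dimension when walking ring ids, so adjacent rings never share a pipe until every pipe has been visited. A compilable sketch of that mapping, reduced to a single MEC with illustrative constants:

#include <stdio.h>

int main(void)
{
    int num_pipe = 4, num_queue = 2;   /* queues amdgpu owns per pipe */
    int ring;

    /* horizontal: pipe varies fastest, so ring N and ring N+1 land on
     * different pipes and get real hardware concurrency */
    for (ring = 0; ring < num_pipe * num_queue; ring++) {
        int pipe  = ring % num_pipe;
        int queue = ring / num_pipe;
        printf("ring %d -> pipe %d, queue %d\n", ring, pipe, queue);
    }
    return 0;
}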
[PATCH 11/17] drm/amdgpu: teach amdgpu how to enable interrupts for any pipe v3
The current implementation is hardcoded to enable ME1/PIPE0 interrupts only. This patch allows amdgpu to enable interrupts for any pipe of ME1. v2: added gfx9 support v3: use soc15_grbm_select for gfx9 Acked-by: Felix Kuehling Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 48 - drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 33 +++ drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 50 +++ 3 files changed, 49 insertions(+), 82 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c index 8520b4b..8969c69 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c @@ -5047,76 +5047,62 @@ static void gfx_v7_0_set_gfx_eop_interrupt_state(struct amdgpu_device *adev, switch (state) { case AMDGPU_IRQ_STATE_DISABLE: cp_int_cntl = RREG32(mmCP_INT_CNTL_RING0); cp_int_cntl &= ~CP_INT_CNTL_RING0__TIME_STAMP_INT_ENABLE_MASK; WREG32(mmCP_INT_CNTL_RING0, cp_int_cntl); break; case AMDGPU_IRQ_STATE_ENABLE: cp_int_cntl = RREG32(mmCP_INT_CNTL_RING0); cp_int_cntl |= CP_INT_CNTL_RING0__TIME_STAMP_INT_ENABLE_MASK; WREG32(mmCP_INT_CNTL_RING0, cp_int_cntl); break; default: break; } } static void gfx_v7_0_set_compute_eop_interrupt_state(struct amdgpu_device *adev, int me, int pipe, enum amdgpu_interrupt_state state) { - u32 mec_int_cntl, mec_int_cntl_reg; - - /* -* amdgpu controls only pipe 0 of MEC1. That's why this function only -* handles the setting of interrupts for this specific pipe. All other -* pipes' interrupts are set by amdkfd. + /* Me 0 is for graphics and Me 2 is reserved for HW scheduling +* So we should only really be configuring ME 1 i.e. MEC0 */ - - if (me == 1) { - switch (pipe) { - case 0: - mec_int_cntl_reg = mmCP_ME1_PIPE0_INT_CNTL; - break; - default: - DRM_DEBUG("invalid pipe %d\n", pipe); - return; - } - } else { - DRM_DEBUG("invalid me %d\n", me); + if (me != 1) { + DRM_ERROR("Ignoring request to enable interrupts for invalid me:%d\n", me); return; } - switch (state) { - case AMDGPU_IRQ_STATE_DISABLE: - mec_int_cntl = RREG32(mec_int_cntl_reg); - mec_int_cntl &= ~CP_INT_CNTL_RING0__TIME_STAMP_INT_ENABLE_MASK; - WREG32(mec_int_cntl_reg, mec_int_cntl); - break; - case AMDGPU_IRQ_STATE_ENABLE: - mec_int_cntl = RREG32(mec_int_cntl_reg); - mec_int_cntl |= CP_INT_CNTL_RING0__TIME_STAMP_INT_ENABLE_MASK; - WREG32(mec_int_cntl_reg, mec_int_cntl); - break; - default: - break; + if (pipe >= adev->gfx.mec.num_pipe_per_mec) { + DRM_ERROR("Ignoring request to enable interrupts for invalid " + "me:%d pipe:%d\n", pipe, me); + return; } + + mutex_lock(&adev->srbm_mutex); + cik_srbm_select(adev, me, pipe, 0, 0); + + WREG32_FIELD(CPC_INT_CNTL, TIME_STAMP_INT_ENABLE, + state == AMDGPU_IRQ_STATE_DISABLE ? 
0 : 1); + + cik_srbm_select(adev, 0, 0, 0, 0); + mutex_unlock(&adev->srbm_mutex); } static int gfx_v7_0_set_priv_reg_fault_state(struct amdgpu_device *adev, struct amdgpu_irq_src *src, unsigned type, enum amdgpu_interrupt_state state) { u32 cp_int_cntl; switch (state) { case AMDGPU_IRQ_STATE_DISABLE: cp_int_cntl = RREG32(mmCP_INT_CNTL_RING0); cp_int_cntl &= ~CP_INT_CNTL_RING0__PRIV_REG_INT_ENABLE_MASK; WREG32(mmCP_INT_CNTL_RING0, cp_int_cntl); break; case AMDGPU_IRQ_STATE_ENABLE: cp_int_cntl = RREG32(mmCP_INT_CNTL_RING0); cp_int_cntl |= CP_INT_CNTL_RING0__PRIV_REG_INT_ENABLE_MASK; WREG32(mmCP_INT_CNTL_RING0, cp_int_cntl); break; diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c index fc94e2b..8cc9874 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c @@ -6769,61 +6769,60 @@ static void gfx_v8_0_ring_emit_wreg(struct amdgpu_ring *ring, uint32_t reg, uint32_t val) { amdgpu_ring_write(ring
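The rewrite above also trades the open-coded mask fiddling for a WREG32_FIELD()-style helper. What such a helper boils down to is a read-modify-write of one named field; here is a userspace model with a mocked register (the mask and shift values are made up for illustration, not the real CPC_INT_CNTL layout):

#include <stdint.h>
#include <stdio.h>

static uint32_t cpc_int_cntl;   /* mocked register backing store */

#define TIME_STAMP_INT_ENABLE_MASK  (1u << 26)   /* illustrative bit */
#define TIME_STAMP_INT_ENABLE_SHIFT 26

/* read-modify-write one named field, leaving all other bits intact */
static void wreg_field(uint32_t *reg, uint32_t mask, int shift, uint32_t val)
{
    *reg = (*reg & ~mask) | ((val << shift) & mask);
}

int main(void)
{
    cpc_int_cntl = 0x0000ffff;                       /* unrelated bits set */
    wreg_field(&cpc_int_cntl, TIME_STAMP_INT_ENABLE_MASK,
               TIME_STAMP_INT_ENABLE_SHIFT, 1);      /* enable */
    printf("enabled:  0x%08x\n", cpc_int_cntl);
    wreg_field(&cpc_int_cntl, TIME_STAMP_INT_ENABLE_MASK,
               TIME_STAMP_INT_ENABLE_SHIFT, 0);      /* disable */
    printf("disabled: 0x%08x\n", cpc_int_cntl);
    return 0;
}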
[PATCH 13/17] drm/amdgpu: remove duplicate magic constants from amdgpu_amdkfd_gfx*.c
This information is already available in adev. Reviewed-by: Edward O'Callaghan Reviewed-by: Felix Kuehling Acked-by: Christian König Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 12 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 12 ++-- 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c index 910f9d3..5254562 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c @@ -22,42 +22,40 @@ #include #include #include #include #include "amdgpu.h" #include "amdgpu_amdkfd.h" #include "cikd.h" #include "cik_sdma.h" #include "amdgpu_ucode.h" #include "gfx_v7_0.h" #include "gca/gfx_7_2_d.h" #include "gca/gfx_7_2_enum.h" #include "gca/gfx_7_2_sh_mask.h" #include "oss/oss_2_0_d.h" #include "oss/oss_2_0_sh_mask.h" #include "gmc/gmc_7_1_d.h" #include "gmc/gmc_7_1_sh_mask.h" #include "cik_structs.h" -#define CIK_PIPE_PER_MEC (4) - enum { MAX_TRAPID = 8, /* 3 bits in the bitfield. */ MAX_WATCH_ADDRESSES = 4 }; enum { ADDRESS_WATCH_REG_ADDR_HI = 0, ADDRESS_WATCH_REG_ADDR_LO, ADDRESS_WATCH_REG_CNTL, ADDRESS_WATCH_REG_MAX }; /* not defined in the CI/KV reg file */ enum { ADDRESS_WATCH_REG_CNTL_ATC_BIT = 0x1000UL, ADDRESS_WATCH_REG_CNTL_DEFAULT_MASK = 0x00FF, ADDRESS_WATCH_REG_ADDLOW_MASK_EXTENSION = 0x0300, /* extend the mask to 26 bits to match the low address field */ ADDRESS_WATCH_REG_ADDLOW_SHIFT = 6, ADDRESS_WATCH_REG_ADDHIGH_MASK = 0x @@ -169,42 +167,44 @@ static void lock_srbm(struct kgd_dev *kgd, uint32_t mec, uint32_t pipe, uint32_t queue, uint32_t vmid) { struct amdgpu_device *adev = get_amdgpu_device(kgd); uint32_t value = PIPEID(pipe) | MEID(mec) | VMID(vmid) | QUEUEID(queue); mutex_lock(&adev->srbm_mutex); WREG32(mmSRBM_GFX_CNTL, value); } static void unlock_srbm(struct kgd_dev *kgd) { struct amdgpu_device *adev = get_amdgpu_device(kgd); WREG32(mmSRBM_GFX_CNTL, 0); mutex_unlock(&adev->srbm_mutex); } static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id, uint32_t queue_id) { - uint32_t mec = (++pipe_id / CIK_PIPE_PER_MEC) + 1; - uint32_t pipe = (pipe_id % CIK_PIPE_PER_MEC); + struct amdgpu_device *adev = get_amdgpu_device(kgd); + + uint32_t mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1; + uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec); lock_srbm(kgd, mec, pipe, queue_id, 0); } static void release_queue(struct kgd_dev *kgd) { unlock_srbm(kgd); } static void kgd_program_sh_mem_settings(struct kgd_dev *kgd, uint32_t vmid, uint32_t sh_mem_config, uint32_t sh_mem_ape1_base, uint32_t sh_mem_ape1_limit, uint32_t sh_mem_bases) { struct amdgpu_device *adev = get_amdgpu_device(kgd); lock_srbm(kgd, 0, 0, 0, vmid); WREG32(mmSH_MEM_CONFIG, sh_mem_config); @@ -237,42 +237,42 @@ static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, unsigned int pasid, /* Mapping vmid to pasid also for IH block */ WREG32(mmIH_VMID_0_LUT + vmid, pasid_mapping); return 0; } static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id, uint32_t hpd_size, uint64_t hpd_gpu_addr) { /* amdgpu owns the per-pipe state */ return 0; } static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id) { struct amdgpu_device *adev = get_amdgpu_device(kgd); uint32_t mec; uint32_t pipe; - mec = (pipe_id / CIK_PIPE_PER_MEC) + 1; - pipe = (pipe_id % CIK_PIPE_PER_MEC); + mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1; + pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec); 
lock_srbm(kgd, mec, pipe, 0, 0); WREG32(mmCPC_INT_CNTL, CP_INT_CNTL_RING0__TIME_STAMP_INT_ENABLE_MASK | CP_INT_CNTL_RING0__OPCODE_ERROR_INT_ENABLE_MASK); unlock_srbm(kgd); return 0; } static inline uint32_t get_sdma_base_addr(struct cik_sdma_rlc_registers *m) { uint32_t retval; retval = m->sdma_engine_id * SDMA1_REGISTER_OFFSET + m->sdma_queue_id * KFD_CIK_SDMA_QUEUE_OFFSET; pr_debug("kfd: sdma base address: 0x%x\n", retval); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c index 6ba94e9..133d066 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_g
[PATCH 16/17] drm/amdgpu: avoid KIQ clashing with compute or KFD queues v2
Instead of picking an arbitrary queue for KIQ, search for one according to policy. The queue must be unused. Also report the KIQ as an unavailable resource to KFD. In testing I ran into KCQ initialization issues when using pipes 2/3 of MEC2 for the KIQ. Therefore the policy disallows grabbing one of these. v2: fix (ring.me + 1) to (ring.me -1) in amdgpu_amdkfd_device_init Reviewed-by: Felix Kuehling Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 23 +--- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 8 ++ drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 43 -- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 42 - 4 files changed, 98 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 0583396..0a58575 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1762,51 +1762,68 @@ static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring, void *sr ring->count_dw -= count_dw; } } static inline struct amdgpu_sdma_instance * amdgpu_get_sdma_instance(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; int i; for (i = 0; i < adev->sdma.num_instances; i++) if (&adev->sdma.instance[i].ring == ring) break; if (i < AMDGPU_MAX_SDMA_INSTANCES) return &adev->sdma.instance[i]; else return NULL; } -static inline bool amdgpu_is_mec_queue_enabled(struct amdgpu_device *adev, - int mec, int pipe, int queue) +static inline int amdgpu_queue_to_bit(struct amdgpu_device *adev, + int mec, int pipe, int queue) { int bit = 0; bit += mec * adev->gfx.mec.num_pipe_per_mec * adev->gfx.mec.num_queue_per_pipe; bit += pipe * adev->gfx.mec.num_queue_per_pipe; bit += queue; - return test_bit(bit, adev->gfx.mec.queue_bitmap); + return bit; +} + +static inline void amdgpu_bit_to_queue(struct amdgpu_device *adev, int bit, + int *mec, int *pipe, int *queue) +{ + *queue = bit % adev->gfx.mec.num_queue_per_pipe; + *pipe = (bit / adev->gfx.mec.num_queue_per_pipe) + % adev->gfx.mec.num_pipe_per_mec; + *mec = (bit / adev->gfx.mec.num_queue_per_pipe) + / adev->gfx.mec.num_pipe_per_mec; + +} +static inline bool amdgpu_is_mec_queue_enabled(struct amdgpu_device *adev, + int mec, int pipe, int queue) +{ + return test_bit(amdgpu_queue_to_bit(adev, mec, pipe, queue), + adev->gfx.mec.queue_bitmap); } /* * ASICs macro. 
*/ #define amdgpu_asic_set_vga_state(adev, state) (adev)->asic_funcs->set_vga_state((adev), (state)) #define amdgpu_asic_reset(adev) (adev)->asic_funcs->reset((adev)) #define amdgpu_asic_get_xclk(adev) (adev)->asic_funcs->get_xclk((adev)) #define amdgpu_asic_set_uvd_clocks(adev, v, d) (adev)->asic_funcs->set_uvd_clocks((adev), (v), (d)) #define amdgpu_asic_set_vce_clocks(adev, ev, ec) (adev)->asic_funcs->set_vce_clocks((adev), (ev), (ec)) #define amdgpu_get_pcie_lanes(adev) (adev)->asic_funcs->get_pcie_lanes((adev)) #define amdgpu_set_pcie_lanes(adev, l) (adev)->asic_funcs->set_pcie_lanes((adev), (l)) #define amdgpu_asic_get_gpu_clock_counter(adev) (adev)->asic_funcs->get_gpu_clock_counter((adev)) #define amdgpu_asic_read_disabled_bios(adev) (adev)->asic_funcs->read_disabled_bios((adev)) #define amdgpu_asic_read_bios_from_rom(adev, b, l) (adev)->asic_funcs->read_bios_from_rom((adev), (b), (l)) #define amdgpu_asic_read_register(adev, se, sh, offset, v)((adev)->asic_funcs->read_register((adev), (se), (sh), (offset), (v))) #define amdgpu_asic_get_config_memsize(adev) (adev)->asic_funcs->get_config_memsize((adev)) #define amdgpu_gart_flush_gpu_tlb(adev, vmid) (adev)->gart.gart_funcs->flush_gpu_tlb((adev), (vmid)) #define amdgpu_gart_set_pte_pde(adev, pt, idx, addr, flags) (adev)->gart.gart_funcs->set_pte_pde((adev), (pt), (idx), (addr), (flags)) #define amdgpu_vm_copy_pte(adev, ib, pe, src, count) ((adev)->vm_manager.vm_pte_funcs->copy_pte((ib), (pe), (src), (count))) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index 8fc5aa3..339e8cd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -94,40 +94,48 @@ void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev) } void amdgpu_amdkfd_device_init(struct amdgpu_device *adev) { int i; int last_valid_bit; if (adev->kfd) { struct kgd2kfd_shared_resources gpu_resources = { .compute_vmid_bitmap = 0xFF00,
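Because amdgpu_queue_to_bit() and amdgpu_bit_to_queue() are pure arithmetic over the (mec, pipe, queue) triple, they can be sanity-checked in isolation. A quick compilable round-trip test, with illustrative topology constants:

#include <assert.h>
#include <stdio.h>

static const int num_pipe_per_mec = 4, num_queue_per_pipe = 8;

static int queue_to_bit(int mec, int pipe, int queue)
{
    return mec * num_pipe_per_mec * num_queue_per_pipe
         + pipe * num_queue_per_pipe + queue;
}

static void bit_to_queue(int bit, int *mec, int *pipe, int *queue)
{
    *queue = bit % num_queue_per_pipe;
    *pipe  = (bit / num_queue_per_pipe) % num_pipe_per_mec;
    *mec   = (bit / num_queue_per_pipe) / num_pipe_per_mec;
}

int main(void)
{
    int mec, pipe, queue;

    /* every valid bit must decompose back to the triple it came from */
    for (mec = 0; mec < 2; mec++)
        for (pipe = 0; pipe < num_pipe_per_mec; pipe++)
            for (queue = 0; queue < num_queue_per_pipe; queue++) {
                int m, p, q;
                bit_to_queue(queue_to_bit(mec, pipe, queue), &m, &p, &q);
                assert(m == mec && p == pipe && q == queue);
            }
    puts("queue_to_bit/bit_to_queue round-trip OK");
    return 0;
}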
[PATCH 06/17] drm/amdgpu: fix kgd_hqd_load failing to update shadow_wptr
The return value from copy_from_user is 0 for the success case. Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c index f9ad534..8af2975 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c @@ -235,41 +235,41 @@ static inline uint32_t get_sdma_base_addr(struct cik_sdma_rlc_registers *m) static inline struct vi_mqd *get_mqd(void *mqd) { return (struct vi_mqd *)mqd; } static inline struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd) { return (struct cik_sdma_rlc_registers *)mqd; } static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id, uint32_t queue_id, uint32_t __user *wptr) { struct vi_mqd *m; uint32_t shadow_wptr, valid_wptr; struct amdgpu_device *adev = get_amdgpu_device(kgd); m = get_mqd(mqd); valid_wptr = copy_from_user(&shadow_wptr, wptr, sizeof(shadow_wptr)); - if (valid_wptr > 0) + if (valid_wptr == 0) m->cp_hqd_pq_wptr = shadow_wptr; acquire_queue(kgd, pipe_id, queue_id); gfx_v8_0_mqd_commit(adev, mqd); release_queue(kgd); return 0; } static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd) { return 0; } static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address, uint32_t pipe_id, uint32_t queue_id) { struct amdgpu_device *adev = get_amdgpu_device(kgd); uint32_t act; bool retval = false; -- 2.9.3 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
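The subtlety being fixed: copy_from_user() returns the number of bytes it could not copy, so 0 -- not a positive value -- is the success case. A userspace stand-in that mimics this contract (memcpy_fallible is a mock for illustration, not a kernel API):

#include <stdio.h>
#include <string.h>

/* mock with copy_from_user's contract: returns bytes NOT copied */
static unsigned long memcpy_fallible(void *dst, const void *src,
                                     unsigned long n, unsigned long faulted)
{
    memcpy(dst, src, n - faulted);
    return faulted;
}

int main(void)
{
    unsigned int shadow = 0, wptr = 42;
    unsigned long ret = memcpy_fallible(&shadow, &wptr, sizeof(shadow), 0);

    if (ret == 0)                 /* correct: 0 means all bytes copied */
        printf("wptr shadow updated to %u\n", shadow);
    if (ret > 0)                  /* the old, inverted check */
        printf("this branch only runs when the copy faulted\n");
    return 0;
}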
[PATCH 07/17] drm/amdgpu: rename rdev to adev
Rename straggler instances of r(adeon)dev to a(mdgpu)dev Reviewed-by: Edward O'Callaghan Reviewed-by: Felix Kuehling Acked-by: Christian König Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 70 +++--- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 14 +++--- drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 2 +- 4 files changed, 44 insertions(+), 44 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index dba8a5b..3200ff9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -43,204 +43,204 @@ int amdgpu_amdkfd_init(void) return -ENOENT; ret = kgd2kfd_init_p(KFD_INTERFACE_VERSION, &kgd2kfd); if (ret) { symbol_put(kgd2kfd_init); kgd2kfd = NULL; } #elif defined(CONFIG_HSA_AMD) ret = kgd2kfd_init(KFD_INTERFACE_VERSION, &kgd2kfd); if (ret) kgd2kfd = NULL; #else ret = -ENOENT; #endif return ret; } -bool amdgpu_amdkfd_load_interface(struct amdgpu_device *rdev) +bool amdgpu_amdkfd_load_interface(struct amdgpu_device *adev) { - switch (rdev->asic_type) { + switch (adev->asic_type) { #ifdef CONFIG_DRM_AMDGPU_CIK case CHIP_KAVERI: kfd2kgd = amdgpu_amdkfd_gfx_7_get_functions(); break; #endif case CHIP_CARRIZO: kfd2kgd = amdgpu_amdkfd_gfx_8_0_get_functions(); break; default: return false; } return true; } void amdgpu_amdkfd_fini(void) { if (kgd2kfd) { kgd2kfd->exit(); symbol_put(kgd2kfd_init); } } -void amdgpu_amdkfd_device_probe(struct amdgpu_device *rdev) +void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev) { if (kgd2kfd) - rdev->kfd = kgd2kfd->probe((struct kgd_dev *)rdev, - rdev->pdev, kfd2kgd); + adev->kfd = kgd2kfd->probe((struct kgd_dev *)adev, + adev->pdev, kfd2kgd); } -void amdgpu_amdkfd_device_init(struct amdgpu_device *rdev) +void amdgpu_amdkfd_device_init(struct amdgpu_device *adev) { - if (rdev->kfd) { + if (adev->kfd) { struct kgd2kfd_shared_resources gpu_resources = { .compute_vmid_bitmap = 0xFF00, .first_compute_pipe = 1, .compute_pipe_count = 4 - 1, }; - amdgpu_doorbell_get_kfd_info(rdev, + amdgpu_doorbell_get_kfd_info(adev, &gpu_resources.doorbell_physical_address, &gpu_resources.doorbell_aperture_size, &gpu_resources.doorbell_start_offset); - kgd2kfd->device_init(rdev->kfd, &gpu_resources); + kgd2kfd->device_init(adev->kfd, &gpu_resources); } } -void amdgpu_amdkfd_device_fini(struct amdgpu_device *rdev) +void amdgpu_amdkfd_device_fini(struct amdgpu_device *adev) { - if (rdev->kfd) { - kgd2kfd->device_exit(rdev->kfd); - rdev->kfd = NULL; + if (adev->kfd) { + kgd2kfd->device_exit(adev->kfd); + adev->kfd = NULL; } } -void amdgpu_amdkfd_interrupt(struct amdgpu_device *rdev, +void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev, const void *ih_ring_entry) { - if (rdev->kfd) - kgd2kfd->interrupt(rdev->kfd, ih_ring_entry); + if (adev->kfd) + kgd2kfd->interrupt(adev->kfd, ih_ring_entry); } -void amdgpu_amdkfd_suspend(struct amdgpu_device *rdev) +void amdgpu_amdkfd_suspend(struct amdgpu_device *adev) { - if (rdev->kfd) - kgd2kfd->suspend(rdev->kfd); + if (adev->kfd) + kgd2kfd->suspend(adev->kfd); } -int amdgpu_amdkfd_resume(struct amdgpu_device *rdev) +int amdgpu_amdkfd_resume(struct amdgpu_device *adev) { int r = 0; - if (rdev->kfd) - r = kgd2kfd->resume(rdev->kfd); + if (adev->kfd) + r = kgd2kfd->resume(adev->kfd); return r; } int alloc_gtt_mem(struct kgd_dev *kgd, size_t size, void **mem_obj, uint64_t *gpu_addr, void **cpu_ptr) { - struct amdgpu_device *rdev = (struct amdgpu_device *)kgd; + struct 
amdgpu_device *adev = (struct amdgpu_device *)kgd; struct kgd_mem **mem = (struct kgd_mem **) mem_obj; int r; BUG_ON(kgd == NULL); BUG_ON(gpu_addr == NULL); BUG_ON(cpu_ptr == NULL); *mem = kmalloc(sizeof(struct kgd_mem), GFP_KERNEL); if ((*mem) == NULL) return -ENOMEM; - r =
[PATCH 08/17] drm/radeon: take ownership of pipe initialization
Take ownership of pipe initialization away from KFD. Note that hpd_eop_gpu_addr was already large enough to accommodate all pipes. Reviewed-by: Edward O'Callaghan Reviewed-by: Felix Kuehling Acked-by: Christian König Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/radeon/cik.c| 27 ++- drivers/gpu/drm/radeon/radeon_kfd.c | 13 + 2 files changed, 15 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c index 53710dd..3d084c2 100644 --- a/drivers/gpu/drm/radeon/cik.c +++ b/drivers/gpu/drm/radeon/cik.c @@ -4563,57 +4563,58 @@ static int cik_cp_compute_resume(struct radeon_device *rdev) bool use_doorbell = true; u64 hqd_gpu_addr; u64 mqd_gpu_addr; u64 eop_gpu_addr; u64 wb_gpu_addr; u32 *buf; struct bonaire_mqd *mqd; r = cik_cp_compute_start(rdev); if (r) return r; /* fix up chicken bits */ tmp = RREG32(CP_CPF_DEBUG); tmp |= (1 << 23); WREG32(CP_CPF_DEBUG, tmp); /* init the pipes */ mutex_lock(&rdev->srbm_mutex); - eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr; + for (i = 0; i < rdev->mec.num_pipe; ++i) { + cik_srbm_select(rdev, 0, i, 0, 0); - cik_srbm_select(rdev, 0, 0, 0, 0); - - /* write the EOP addr */ - WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8); - WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8); + eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr + (i * MEC_HPD_SIZE * 2) ; + /* write the EOP addr */ + WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8); + WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8); - /* set the VMID assigned */ - WREG32(CP_HPD_EOP_VMID, 0); + /* set the VMID assigned */ + WREG32(CP_HPD_EOP_VMID, 0); - /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */ - tmp = RREG32(CP_HPD_EOP_CONTROL); - tmp &= ~EOP_SIZE_MASK; - tmp |= order_base_2(MEC_HPD_SIZE / 8); - WREG32(CP_HPD_EOP_CONTROL, tmp); + /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */ + tmp = RREG32(CP_HPD_EOP_CONTROL); + tmp &= ~EOP_SIZE_MASK; + tmp |= order_base_2(MEC_HPD_SIZE / 8); + WREG32(CP_HPD_EOP_CONTROL, tmp); + } mutex_unlock(&rdev->srbm_mutex); /* init the queues. Just two for now. 
*/ for (i = 0; i < 2; i++) { if (i == 0) idx = CAYMAN_RING_TYPE_CP1_INDEX; else idx = CAYMAN_RING_TYPE_CP2_INDEX; if (rdev->ring[idx].mqd_obj == NULL) { r = radeon_bo_create(rdev, sizeof(struct bonaire_mqd), PAGE_SIZE, true, RADEON_GEM_DOMAIN_GTT, 0, NULL, NULL, &rdev->ring[idx].mqd_obj); if (r) { dev_warn(rdev->dev, "(%d) create MQD bo failed\n", r); return r; } } diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c index 87a9ebb..a06e3b1 100644 --- a/drivers/gpu/drm/radeon/radeon_kfd.c +++ b/drivers/gpu/drm/radeon/radeon_kfd.c @@ -406,52 +406,41 @@ static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, unsigned int pasid, ATC_VMID_PASID_MAPPING_VALID_MASK; write_register(kgd, ATC_VMID0_PASID_MAPPING + vmid*sizeof(uint32_t), pasid_mapping); while (!(read_register(kgd, ATC_VMID_PASID_MAPPING_UPDATE_STATUS) & (1U << vmid))) cpu_relax(); write_register(kgd, ATC_VMID_PASID_MAPPING_UPDATE_STATUS, 1U << vmid); /* Mapping vmid to pasid also for IH block */ write_register(kgd, IH_VMID_0_LUT + vmid * sizeof(uint32_t), pasid_mapping); return 0; } static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id, uint32_t hpd_size, uint64_t hpd_gpu_addr) { - uint32_t mec = (pipe_id / CIK_PIPE_PER_MEC) + 1; - uint32_t pipe = (pipe_id % CIK_PIPE_PER_MEC); - - lock_srbm(kgd, mec, pipe, 0, 0); - write_register(kgd, CP_HPD_EOP_BASE_ADDR, - lower_32_bits(hpd_gpu_addr >> 8)); - write_register(kgd, CP_HPD_EOP_BASE_ADDR_HI, - upper_32_bits(hpd_gpu_addr >> 8)); - write_register(kgd, CP_HPD_EOP_VMID, 0); - write_register(kgd, CP_HPD_EOP_CONTROL, hpd_size); - unlock_srbm(kgd);
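Each pipe's EOP buffer is a fixed-size slice of one shared BO, and the CP_HPD_EOP_BASE_ADDR registers take the address in 256-byte units, hence the >> 8. A small compilable model of the addressing (the base address and pipe count here are illustrative):

#include <stdint.h>
#include <stdio.h>

#define MEC_HPD_SIZE 2048u   /* per-pipe HPD/EOP slice, as in cik */

int main(void)
{
    uint64_t base = 0x100000;        /* illustrative GPU VA of the EOP BO */
    int i, num_pipe = 4;

    for (i = 0; i < num_pipe; i++) {
        uint64_t eop = base + (uint64_t)i * MEC_HPD_SIZE * 2;
        /* registers hold the 256-byte-aligned address, hence >> 8 */
        printf("pipe %d: BASE_ADDR=0x%08x BASE_ADDR_HI=0x%08x\n",
               i, (uint32_t)(eop >> 8), (uint32_t)((eop >> 32) >> 8));
    }
    return 0;
}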
[PATCH 09/17] drm/amdgpu: take ownership of per-pipe configuration v2
Make amdgpu the owner of all per-pipe state of the HQDs. This change will allow us to split the queues between kfd and amdgpu with a queue granularity instead of pipe granularity. This patch fixes kfd allocating an HDP_EOP region for its 3 pipes which goes unused. v2: support for gfx9 Reviewed-by: Edward O'Callaghan Reviewed-by: Felix Kuehling Acked-by: Christian König Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 4 +- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 13 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 1 + drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 28 ++ drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 33 +++- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 24 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 45 -- 7 files changed, 65 insertions(+), 83 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 3abd2dc..6b294d2 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -880,43 +880,43 @@ struct amdgpu_rlc { /* for firmware data */ u32 save_and_restore_offset; u32 clear_state_descriptor_offset; u32 avail_scratch_ram_locations; u32 reg_restore_list_size; u32 reg_list_format_start; u32 reg_list_format_separate_start; u32 starting_offsets_start; u32 reg_list_format_size_bytes; u32 reg_list_size_bytes; u32 *register_list_format; u32 *register_restore; }; struct amdgpu_mec { struct amdgpu_bo*hpd_eop_obj; u64 hpd_eop_gpu_addr; struct amdgpu_bo*mec_fw_obj; u64 mec_fw_gpu_addr; - u32 num_pipe; u32 num_mec; - u32 num_queue; + u32 num_pipe_per_mec; + u32 num_queue_per_pipe; void*mqd_backup[AMDGPU_MAX_COMPUTE_RINGS + 1]; }; struct amdgpu_kiq { u64 eop_gpu_addr; struct amdgpu_bo*eop_obj; struct amdgpu_ring ring; struct amdgpu_irq_src irq; }; /* * GPU scratch registers structures, functions & helpers */ struct amdgpu_scratch { unsignednum_reg; uint32_treg_base; uint32_tfree_mask; }; /* diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c index 038b7ea..910f9d3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c @@ -227,52 +227,41 @@ static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, unsigned int pasid, * SW cleared it. So the protocol is to always wait & clear. */ uint32_t pasid_mapping = (pasid == 0) ? 
0 : (uint32_t)pasid | ATC_VMID0_PASID_MAPPING__VALID_MASK; WREG32(mmATC_VMID0_PASID_MAPPING + vmid, pasid_mapping); while (!(RREG32(mmATC_VMID_PASID_MAPPING_UPDATE_STATUS) & (1U << vmid))) cpu_relax(); WREG32(mmATC_VMID_PASID_MAPPING_UPDATE_STATUS, 1U << vmid); /* Mapping vmid to pasid also for IH block */ WREG32(mmIH_VMID_0_LUT + vmid, pasid_mapping); return 0; } static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id, uint32_t hpd_size, uint64_t hpd_gpu_addr) { - struct amdgpu_device *adev = get_amdgpu_device(kgd); - - uint32_t mec = (++pipe_id / CIK_PIPE_PER_MEC) + 1; - uint32_t pipe = (pipe_id % CIK_PIPE_PER_MEC); - - lock_srbm(kgd, mec, pipe, 0, 0); - WREG32(mmCP_HPD_EOP_BASE_ADDR, lower_32_bits(hpd_gpu_addr >> 8)); - WREG32(mmCP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(hpd_gpu_addr >> 8)); - WREG32(mmCP_HPD_EOP_VMID, 0); - WREG32(mmCP_HPD_EOP_CONTROL, hpd_size); - unlock_srbm(kgd); - + /* amdgpu owns the per-pipe state */ return 0; } static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id) { struct amdgpu_device *adev = get_amdgpu_device(kgd); uint32_t mec; uint32_t pipe; mec = (pipe_id / CIK_PIPE_PER_MEC) + 1; pipe = (pipe_id % CIK_PIPE_PER_MEC); lock_srbm(kgd, mec, pipe, 0, 0); WREG32(mmCPC_INT_CNTL, CP_INT_CNTL_RING0__TIME_STAMP_INT_ENABLE_MASK | CP_INT_CNTL_RING0__OPCODE_ERROR_INT_ENABLE_MASK); unlock_srbm(kgd); return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c index 8af2975..6ba94e9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c @@ -189,40 +189,41 @@ static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, unsi
[PATCH 02/17] drm/amdgpu: refactor MQD/HQD initialization v3
The MQD programming sequence currently exists in 3 different places. Refactor it to absorb all the duplicates. The success path remains mostly identical except for a slightly different order in the non-kiq case. This shouldn't matter if the HQD is disabled. The error handling paths have been updated to deal with the new code structure. v2: the non-kiq path for gfxv8 was dropped in the rebase v3: split MEC_HPD_SIZE rename, dropped doorbell changes Reviewed-by: Edward O'Callaghan Acked-by: Christian König Acked-by: Felix Kuehling Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 439 ++ drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 78 +++--- 2 files changed, 271 insertions(+), 246 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c index 3b98162..4e6a60c 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c @@ -2927,281 +2927,316 @@ struct bonaire_mqd u32 perf_counter_enable; u32 pgm[2]; u32 tba[2]; u32 tma[2]; u32 pgm_rsrc[2]; u32 vmid; u32 resource_limits; u32 static_thread_mgmt01[2]; u32 tmp_ring_size; u32 static_thread_mgmt23[2]; u32 restart[3]; u32 thread_trace_enable; u32 reserved1; u32 user_data[16]; u32 vgtcs_invoke_count[2]; struct hqd_registers queue_state; u32 dequeue_cntr; u32 interrupt_queue[64]; }; -/** - * gfx_v7_0_cp_compute_resume - setup the compute queue registers - * - * @adev: amdgpu_device pointer - * - * Program the compute queues and test them to make sure they - * are working. - * Returns 0 for success, error for failure. - */ -static int gfx_v7_0_cp_compute_resume(struct amdgpu_device *adev) +static void gfx_v7_0_compute_pipe_init(struct amdgpu_device *adev, int me, int pipe) { - int r, i, j; - u32 tmp; - bool use_doorbell = true; - u64 hqd_gpu_addr; - u64 mqd_gpu_addr; u64 eop_gpu_addr; - u64 wb_gpu_addr; - u32 *buf; - struct bonaire_mqd *mqd; - struct amdgpu_ring *ring; - - /* fix up chicken bits */ - tmp = RREG32(mmCP_CPF_DEBUG); - tmp |= (1 << 23); - WREG32(mmCP_CPF_DEBUG, tmp); + u32 tmp; + size_t eop_offset = me * pipe * GFX7_MEC_HPD_SIZE * 2; - /* init the pipes */ mutex_lock(&adev->srbm_mutex); - for (i = 0; i < (adev->gfx.mec.num_pipe * adev->gfx.mec.num_mec); i++) { - int me = (i < 4) ? 1 : 2; - int pipe = (i < 4) ? i : (i - 4); + eop_gpu_addr = adev->gfx.mec.hpd_eop_gpu_addr + eop_offset; - eop_gpu_addr = adev->gfx.mec.hpd_eop_gpu_addr + (i * GFX7_MEC_HPD_SIZE * 2); + cik_srbm_select(adev, me, pipe, 0, 0); - cik_srbm_select(adev, me, pipe, 0, 0); + /* write the EOP addr */ + WREG32(mmCP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8); + WREG32(mmCP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8); - /* write the EOP addr */ - WREG32(mmCP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8); - WREG32(mmCP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8); + /* set the VMID assigned */ + WREG32(mmCP_HPD_EOP_VMID, 0); - /* set the VMID assigned */ - WREG32(mmCP_HPD_EOP_VMID, 0); + /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */ + tmp = RREG32(mmCP_HPD_EOP_CONTROL); + tmp &= ~CP_HPD_EOP_CONTROL__EOP_SIZE_MASK; + tmp |= order_base_2(GFX7_MEC_HPD_SIZE / 8); + WREG32(mmCP_HPD_EOP_CONTROL, tmp); - /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */ - tmp = RREG32(mmCP_HPD_EOP_CONTROL); - tmp &= ~CP_HPD_EOP_CONTROL__EOP_SIZE_MASK; - tmp |= order_base_2(GFX7_MEC_HPD_SIZE / 8); - WREG32(mmCP_HPD_EOP_CONTROL, tmp); - } cik_srbm_select(adev, 0, 0, 0, 0); mutex_unlock(&adev->srbm_mutex); +} - /* init the queues. Just two for now. 
*/ - for (i = 0; i < adev->gfx.num_compute_rings; i++) { - ring = &adev->gfx.compute_ring[i]; +static int gfx_v7_0_mqd_deactivate(struct amdgpu_device *adev) +{ + int i; - if (ring->mqd_obj == NULL) { - r = amdgpu_bo_create(adev, -sizeof(struct bonaire_mqd), -PAGE_SIZE, true, -AMDGPU_GEM_DOMAIN_GTT, 0, NULL, NULL, -&ring->mqd_obj); - if (r) { - dev_warn(adev->dev, "(%d) create MQD bo failed\n", r); - return r; - } + /* disable the queue if it's active */ + if (RREG32(mm
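The per-pipe init now takes an explicit (me, pipe) pair instead of a flat loop index, so the offset into the shared HPD BO has to linearize the pair itself. Below is a compilable sketch of one consistent layout, keeping in mind that compute MEs are numbered from 1, so ME 1 maps to slot 0. Note that the offset must depend on both coordinates independently: a simple product like me * pipe would collapse every pipe 0 onto offset 0.

#include <stdio.h>

#define GFX7_MEC_HPD_SIZE 2048u

int main(void)
{
    int num_pipe_per_mec = 4;
    int me, pipe;

    /* flat slot: compute MEs are numbered from 1, pipes from 0 */
    for (me = 1; me <= 2; me++)
        for (pipe = 0; pipe < num_pipe_per_mec; pipe++) {
            unsigned int slot = (me - 1) * num_pipe_per_mec + pipe;
            unsigned long off = (unsigned long)slot * GFX7_MEC_HPD_SIZE * 2;
            printf("me %d pipe %d -> offset 0x%lx\n", me, pipe, off);
        }
    return 0;
}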
[PATCH 03/17] drm/amdgpu: detect timeout error when deactivating hqd
Handle HQD deactivation timeouts instead of ignoring them. Reviewed-by: Edward O'Callaghan Acked-by: Christian König Acked-by: Felix Kuehling Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c index b670302..cd1af26 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c @@ -4947,75 +4947,89 @@ static int gfx_v8_0_mqd_commit(struct amdgpu_ring *ring) /* enable the doorbell if requested */ WREG32(mmCP_HQD_PQ_DOORBELL_CONTROL, mqd->cp_hqd_pq_doorbell_control); /* reset read and write pointers, similar to CP_RB0_WPTR/_RPTR */ WREG32(mmCP_HQD_PQ_WPTR, mqd->cp_hqd_pq_wptr); /* set the vmid for the queue */ WREG32(mmCP_HQD_VMID, mqd->cp_hqd_vmid); WREG32(mmCP_HQD_PERSISTENT_STATE, mqd->cp_hqd_persistent_state); /* activate the queue */ WREG32(mmCP_HQD_ACTIVE, mqd->cp_hqd_active); return 0; } static int gfx_v8_0_kiq_init_queue(struct amdgpu_ring *ring) { + int r = 0; struct amdgpu_device *adev = ring->adev; struct vi_mqd *mqd = ring->mqd_ptr; int mqd_idx = AMDGPU_MAX_COMPUTE_RINGS; gfx_v8_0_kiq_setting(ring); if (adev->gfx.in_reset) { /* for GPU_RESET case */ /* reset MQD to a clean status */ if (adev->gfx.mec.mqd_backup[mqd_idx]) memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], sizeof(*mqd)); /* reset ring buffer */ ring->wptr = 0; amdgpu_ring_clear_ring(ring); mutex_lock(&adev->srbm_mutex); vi_srbm_select(adev, ring->me, ring->pipe, ring->queue, 0); - gfx_v8_0_deactivate_hqd(adev, 1); + r = gfx_v8_0_deactivate_hqd(adev, 1); + if (r) { + dev_err(adev->dev, "failed to deactivate ring %s\n", ring->name); + goto out_unlock; + } gfx_v8_0_mqd_commit(ring); vi_srbm_select(adev, 0, 0, 0, 0); mutex_unlock(&adev->srbm_mutex); } else { mutex_lock(&adev->srbm_mutex); vi_srbm_select(adev, ring->me, ring->pipe, ring->queue, 0); gfx_v8_0_mqd_init(ring); - gfx_v8_0_deactivate_hqd(adev, 1); + r = gfx_v8_0_deactivate_hqd(adev, 1); + if (r) { + dev_err(adev->dev, "failed to deactivate ring %s\n", ring->name); + goto out_unlock; + } gfx_v8_0_mqd_commit(ring); vi_srbm_select(adev, 0, 0, 0, 0); mutex_unlock(&adev->srbm_mutex); if (adev->gfx.mec.mqd_backup[mqd_idx]) memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(*mqd)); } - return 0; + return r; + +out_unlock: + vi_srbm_select(adev, 0, 0, 0, 0); + mutex_unlock(&adev->srbm_mutex); + return r; } static int gfx_v8_0_kcq_init_queue(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; struct vi_mqd *mqd = ring->mqd_ptr; int mqd_idx = ring - &adev->gfx.compute_ring[0]; if (!adev->gfx.in_reset && !adev->gfx.in_suspend) { mutex_lock(&adev->srbm_mutex); vi_srbm_select(adev, ring->me, ring->pipe, ring->queue, 0); gfx_v8_0_mqd_init(ring); vi_srbm_select(adev, 0, 0, 0, 0); mutex_unlock(&adev->srbm_mutex); if (adev->gfx.mec.mqd_backup[mqd_idx]) memcpy(adev->gfx.mec.mqd_backup[mqd_idx], mqd, sizeof(*mqd)); } else if (adev->gfx.in_reset) { /* for GPU_RESET case */ /* reset MQD to a clean status */ if (adev->gfx.mec.mqd_backup[mqd_idx]) -- 2.9.3 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
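The deactivate path is the classic bounded-poll idiom. A compilable userspace skeleton of it, with a mocked register and a mocked udelay(), showing how -ETIMEDOUT now propagates to the caller instead of being silently dropped:

#include <errno.h>
#include <stdio.h>

#define USEC_TIMEOUT 100000

static unsigned int hqd_active = 1;   /* mocked mmCP_HQD_ACTIVE */

static unsigned int rreg(void) { return hqd_active; }
static void udelay_mock(int us) { (void)us; hqd_active = 0; /* HW drains */ }

static int deactivate_hqd(void)
{
    int i;

    for (i = 0; i < USEC_TIMEOUT; i++) {
        if (!(rreg() & 1))
            return 0;
        udelay_mock(1);
    }
    return -ETIMEDOUT;   /* the caller must not ignore this anymore */
}

int main(void)
{
    int r = deactivate_hqd();
    if (r)
        fprintf(stderr, "failed to deactivate ring: %d\n", r);
    else
        puts("HQD deactivated");
    return 0;
}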
[PATCH 04/17] drm/amdgpu: remove duplicate definition of cik_mqd
The gfxv7 contains a slightly different version of cik_mqd called bonaire_mqd. This can introduce subtle bugs if fixes are not applied in both places. Reviewed-by: Edward O'Callaghan Acked-by: Christian König Acked-by: Felix Kuehling Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 135 ++ 1 file changed, 54 insertions(+), 81 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c index 4e6a60c..c408af5 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c @@ -10,40 +10,41 @@ * * The above copyright notice and this permission notice shall be included in * all copies or substantial portions of the Software. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR * OTHER DEALINGS IN THE SOFTWARE. * */ #include #include "drmP.h" #include "amdgpu.h" #include "amdgpu_ih.h" #include "amdgpu_gfx.h" #include "cikd.h" #include "cik.h" +#include "cik_structs.h" #include "atom.h" #include "amdgpu_ucode.h" #include "clearstate_ci.h" #include "dce/dce_8_0_d.h" #include "dce/dce_8_0_sh_mask.h" #include "bif/bif_4_1_d.h" #include "bif/bif_4_1_sh_mask.h" #include "gca/gfx_7_0_d.h" #include "gca/gfx_7_2_enum.h" #include "gca/gfx_7_2_sh_mask.h" #include "gmc/gmc_7_0_d.h" #include "gmc/gmc_7_0_sh_mask.h" #include "oss/oss_2_0_d.h" #include "oss/oss_2_0_sh_mask.h" @@ -2899,68 +2900,40 @@ struct hqd_registers u32 cp_hqd_pq_control; u32 cp_hqd_ib_base_addr; u32 cp_hqd_ib_base_addr_hi; u32 cp_hqd_ib_rptr; u32 cp_hqd_ib_control; u32 cp_hqd_iq_timer; u32 cp_hqd_iq_rptr; u32 cp_hqd_dequeue_request; u32 cp_hqd_dma_offload; u32 cp_hqd_sema_cmd; u32 cp_hqd_msg_type; u32 cp_hqd_atomic0_preop_lo; u32 cp_hqd_atomic0_preop_hi; u32 cp_hqd_atomic1_preop_lo; u32 cp_hqd_atomic1_preop_hi; u32 cp_hqd_hq_scheduler0; u32 cp_hqd_hq_scheduler1; u32 cp_mqd_control; }; -struct bonaire_mqd -{ - u32 header; - u32 dispatch_initiator; - u32 dimensions[3]; - u32 start_idx[3]; - u32 num_threads[3]; - u32 pipeline_stat_enable; - u32 perf_counter_enable; - u32 pgm[2]; - u32 tba[2]; - u32 tma[2]; - u32 pgm_rsrc[2]; - u32 vmid; - u32 resource_limits; - u32 static_thread_mgmt01[2]; - u32 tmp_ring_size; - u32 static_thread_mgmt23[2]; - u32 restart[3]; - u32 thread_trace_enable; - u32 reserved1; - u32 user_data[16]; - u32 vgtcs_invoke_count[2]; - struct hqd_registers queue_state; - u32 dequeue_cntr; - u32 interrupt_queue[64]; -}; - static void gfx_v7_0_compute_pipe_init(struct amdgpu_device *adev, int me, int pipe) { u64 eop_gpu_addr; u32 tmp; size_t eop_offset = me * pipe * GFX7_MEC_HPD_SIZE * 2; mutex_lock(&adev->srbm_mutex); eop_gpu_addr = adev->gfx.mec.hpd_eop_gpu_addr + eop_offset; cik_srbm_select(adev, me, pipe, 0, 0); /* write the EOP addr */ WREG32(mmCP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8); WREG32(mmCP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >> 8); /* set the VMID assigned */ WREG32(mmCP_HPD_EOP_VMID, 0); /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */ tmp = RREG32(mmCP_HPD_EOP_CONTROL); @@ -2980,182 +2953,182 @@ static int gfx_v7_0_mqd_deactivate(struct amdgpu_device *adev) if (RREG32(mmCP_HQD_ACTIVE) & 1) { 
WREG32(mmCP_HQD_DEQUEUE_REQUEST, 1); for (i = 0; i < adev->usec_timeout; i++) { if (!(RREG32(mmCP_HQD_ACTIVE) & 1)) break; udelay(1); } if (i == adev->usec_timeout) return -ETIMEDOUT; WREG32(mmCP_HQD_DEQUEUE_REQUEST, 0); WREG32(mmCP_HQD_PQ_RPTR, 0); WREG32(mmCP_HQD_PQ_WPTR, 0); } return 0; } static void gfx_v7_0_mqd_init(struct amdgpu_device *adev, -struct bonaire_mqd *mqd, +struct cik_mqd *mqd, uint64_t mqd_gpu_addr, struct amdgpu_ring *ring) { u64 hqd_gpu_addr; u64 wb_gpu_addr; /* init the mqd struct */ - memset(mqd, 0, sizeof(struct b
[PATCH split] Improve pipe split between amdgpu and amdkfd
This is a split of patches that are ready to land from the series "Add support for high priority scheduling in amdgpu v8". I've included Felix's and Alex's feedback from the thread above. This includes: * Separated the MEC_HPD_SIZE rename into its own patch (patch 01) * Added a patch to fix the kgd_hqd_load bug Felix pointed out (patch 06) * Fixes for various off-by-one errors * Use of gfx_v8_0_deactivate_hqd The only comment I didn't address was changing the queue allocation policy for gfx9 (similar to gfx7/8). See the inline reply in that thread for more details on why this was skipped. ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] dmr/amdgpu: Fix wrongly unref of BO
According to comment of amdgpu_bo_reserve, amdgpu_bo_reserve can return with -ERESTARTSYS. When this function was interrupted by a signal, BO should not be unref. Otherwise the BO might be released while is kmapped and pinned, or BO MIGHT be deref multiple times, etc. Change-Id: If76071a768950a0d3ad9d5da7fcae04881807621 Signed-off-by: Alex Xie --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 53996e3..1dcc2d1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -355,8 +355,8 @@ static void amdgpu_vram_scratch_fini(struct amdgpu_device *adev) amdgpu_bo_kunmap(adev->vram_scratch.robj); amdgpu_bo_unpin(adev->vram_scratch.robj); amdgpu_bo_unreserve(adev->vram_scratch.robj); + amdgpu_bo_unref(&adev->vram_scratch.robj); } - amdgpu_bo_unref(&adev->vram_scratch.robj); } /** -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
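The shape of the fix generalizes: only unwind state that a successful reserve actually protects, and only drop the reference on that same path, so an interrupted reserve cannot lead to a double unref. A hedged sketch of the ordering with illustrative stand-ins (this is not the amdgpu API):

#include <stdio.h>

struct bo { int mapped, pinned, refs; };

/* stand-in for amdgpu_bo_reserve: 0 on success, or e.g. -ERESTARTSYS */
static int bo_reserve(struct bo *b) { (void)b; return 0; }

static void scratch_fini(struct bo **pbo)
{
    struct bo *b = *pbo;

    if (b == NULL)
        return;
    if (bo_reserve(b) == 0) {
        b->mapped = 0;            /* kunmap    */
        b->pinned = 0;            /* unpin     */
        /* unreserve, then drop the reference -- only on this path,
         * so an interrupted reserve never leads to a stray unref */
        b->refs--;
        *pbo = NULL;
    }
}

int main(void)
{
    struct bo scratch = { 1, 1, 1 };
    struct bo *p = &scratch;
    scratch_fini(&p);
    printf("refs now %d, pointer %s\n", scratch.refs, p ? "set" : "cleared");
    return 0;
}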
[PATCH 01/17] drm/amdgpu: clarify MEC_HPD_SIZE is specific to a gfx generation
Rename MEC_HPD_SIZE to GFXN_MEC_HPD_SIZE to clarify it is specific to a gfx generation. Signed-off-by: Andres Rodriguez --- drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 11 +-- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 15 +++ drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 17 - 3 files changed, 20 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c index c930bb8..3b98162 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c @@ -32,40 +32,41 @@ #include "clearstate_ci.h" #include "dce/dce_8_0_d.h" #include "dce/dce_8_0_sh_mask.h" #include "bif/bif_4_1_d.h" #include "bif/bif_4_1_sh_mask.h" #include "gca/gfx_7_0_d.h" #include "gca/gfx_7_2_enum.h" #include "gca/gfx_7_2_sh_mask.h" #include "gmc/gmc_7_0_d.h" #include "gmc/gmc_7_0_sh_mask.h" #include "oss/oss_2_0_d.h" #include "oss/oss_2_0_sh_mask.h" #define GFX7_NUM_GFX_RINGS 1 #define GFX7_NUM_COMPUTE_RINGS 8 +#define GFX7_MEC_HPD_SIZE 2048 static void gfx_v7_0_set_ring_funcs(struct amdgpu_device *adev); static void gfx_v7_0_set_irq_funcs(struct amdgpu_device *adev); static void gfx_v7_0_set_gds_init(struct amdgpu_device *adev); MODULE_FIRMWARE("radeon/bonaire_pfp.bin"); MODULE_FIRMWARE("radeon/bonaire_me.bin"); MODULE_FIRMWARE("radeon/bonaire_ce.bin"); MODULE_FIRMWARE("radeon/bonaire_rlc.bin"); MODULE_FIRMWARE("radeon/bonaire_mec.bin"); MODULE_FIRMWARE("radeon/hawaii_pfp.bin"); MODULE_FIRMWARE("radeon/hawaii_me.bin"); MODULE_FIRMWARE("radeon/hawaii_ce.bin"); MODULE_FIRMWARE("radeon/hawaii_rlc.bin"); MODULE_FIRMWARE("radeon/hawaii_mec.bin"); MODULE_FIRMWARE("radeon/kaveri_pfp.bin"); MODULE_FIRMWARE("radeon/kaveri_me.bin"); MODULE_FIRMWARE("radeon/kaveri_ce.bin"); @@ -2804,90 +2805,88 @@ static void gfx_v7_0_cp_compute_fini(struct amdgpu_device *adev) } } } static void gfx_v7_0_mec_fini(struct amdgpu_device *adev) { int r; if (adev->gfx.mec.hpd_eop_obj) { r = amdgpu_bo_reserve(adev->gfx.mec.hpd_eop_obj, false); if (unlikely(r != 0)) dev_warn(adev->dev, "(%d) reserve HPD EOP bo failed\n", r); amdgpu_bo_unpin(adev->gfx.mec.hpd_eop_obj); amdgpu_bo_unreserve(adev->gfx.mec.hpd_eop_obj); amdgpu_bo_unref(&adev->gfx.mec.hpd_eop_obj); adev->gfx.mec.hpd_eop_obj = NULL; } } -#define MEC_HPD_SIZE 2048 - static int gfx_v7_0_mec_init(struct amdgpu_device *adev) { int r; u32 *hpd; /* * KV:2 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 64 Queues total * CI/KB: 1 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 32 Queues total * Nonetheless, we assign only 1 pipe because all other pipes will * be handled by KFD */ adev->gfx.mec.num_mec = 1; adev->gfx.mec.num_pipe = 1; adev->gfx.mec.num_queue = adev->gfx.mec.num_mec * adev->gfx.mec.num_pipe * 8; if (adev->gfx.mec.hpd_eop_obj == NULL) { r = amdgpu_bo_create(adev, -adev->gfx.mec.num_mec *adev->gfx.mec.num_pipe * MEC_HPD_SIZE * 2, +adev->gfx.mec.num_mec * adev->gfx.mec.num_pipe * GFX7_MEC_HPD_SIZE * 2, PAGE_SIZE, true, AMDGPU_GEM_DOMAIN_GTT, 0, NULL, NULL, &adev->gfx.mec.hpd_eop_obj); if (r) { dev_warn(adev->dev, "(%d) create HDP EOP bo failed\n", r); return r; } } r = amdgpu_bo_reserve(adev->gfx.mec.hpd_eop_obj, false); if (unlikely(r != 0)) { gfx_v7_0_mec_fini(adev); return r; } r = amdgpu_bo_pin(adev->gfx.mec.hpd_eop_obj, AMDGPU_GEM_DOMAIN_GTT, &adev->gfx.mec.hpd_eop_gpu_addr); if (r) { dev_warn(adev->dev, "(%d) pin HDP EOP bo failed\n", r); gfx_v7_0_mec_fini(adev); return r; } r = amdgpu_bo_kmap(adev->gfx.mec.hpd_eop_obj, (void **)&hpd); if (r) { dev_warn(adev->dev, "(%d) map HDP EOP bo failed\n", r); gfx_v7_0_mec_fini(adev); return r; } /* 
clear memory. Not sure if this is required or not */ - memset(hpd, 0, adev->gfx.mec.num_mec *adev->gfx.mec.num_pipe * MEC_HPD_SIZE * 2); + memset(hpd, 0, adev->gfx.mec.num_mec * adev->gfx.mec.num_pipe * GFX7_MEC_HPD_SIZE * 2); amdgpu_bo_kunmap(adev->gfx.mec.hpd_eop_obj); amdgpu_bo_unreserve(adev->gfx.mec.hpd_eop_obj); return 0; } struct hqd_registers { u32 cp_mqd_base_addr; u32 cp_mqd_base_addr_hi; u32 cp_hqd_active; u32 cp_hqd_vmid;
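The sizing math GFX7_MEC_HPD_SIZE feeds into is simple but worth spelling out once: queues = mec * pipes * 8, and the EOP BO is one HPD-sized slice per pipe, times the factor of 2 carried over from the existing allocation. A compilable sketch with the values gfx7 claims at this point in the series (1 MEC, 1 pipe):

#include <stdio.h>

#define GFX7_MEC_HPD_SIZE 2048u

int main(void)
{
    unsigned int num_mec = 1, num_pipe = 1;   /* what gfx7 claims here */
    unsigned int num_queue = num_mec * num_pipe * 8;
    unsigned long bo_size = (unsigned long)num_mec * num_pipe
                            * GFX7_MEC_HPD_SIZE * 2;

    printf("%u queues, HPD EOP bo of %lu bytes\n", num_queue, bo_size);
    return 0;
}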
Re: amdgpu 0000:84:00.0: gpu post error! \\ Fatal error during GPU init
Hi! On Donnerstag, 13. April 2017 17:30:45 CEST Deucher, Alexander wrote: > > [ 17.692746] amdgpu :84:00.0: enabling device ( -> 0003) > > [ 17.692940] [drm] initializing kernel modesetting (TONGA 0x1002:0x6929 > > 0x1002:0x0334 0x00). > > [ 17.692963] [drm] register mmio base: 0xD010 > > [ 17.692964] [drm] register mmio size: 262144 > > [ 17.692970] [drm] doorbell mmio base: 0xF000 > > [ 17.692971] [drm] doorbell mmio size: 2097152 > > [ 17.692980] [drm] probing gen 2 caps for device 10b5:8747 = 8796103/10e > > [ 17.692981] [drm] probing mlw for device 10b5:8747 = 8796103 > > [ 17.692992] [drm] VCE enabled in physical mode > > [ 18.648132] ATOM BIOS: C76301 > > [ 18.651758] [drm] GPU posting now... > > [ 23.661513] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios > > stuck in > > loop for more than 5secs aborting > > [ 23.673155] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios > > stuck > > executing F250 (len 334, WS 4, PS 0) @ 0xF365 > > [ 23.685453] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios > > stuck > > executing DB34 (len 324, WS 4, PS 0) @ 0xDC2C > > [ 23.697816] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios > > stuck > > executing BCDE (len 254, WS 0, PS 4) @ 0xBDB4 > > [ 23.710137] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios > > stuck > > executing B832 (len 143, WS 0, PS 8) @ 0xB8A9 > > [ 23.722451] amdgpu :84:00.0: gpu post error! > > [ 23.727950] amdgpu :84:00.0: Fatal error during GPU init > > Posting the GPU is failing. The is the initial basic asic setup that is > required before anything else can happen. There seem to be timeouts > waiting for some register states. Is there anything special about your > setup? Can you try a vanilla kernel? I don't think there is anything special. At least not that I am aware of. Dell R730xd with one AMD FirePro S7150X2 and 2 Mellanox ConnectX-4 Dual Port cards. Apart from the modifications shown in the commit log, I made no changes to the CoreOS Container Linux 1381 development version. The kernel is now unpatched, stock 4.10.9. Please find the logs of the unpatched / vanilla kernel attached. --Dennis[SOL Session operational. Use ~? for help] [c[Ãæþàæþàà à ààà à à[c[2J[01;01H[=3h[2J[01;01H[2J[01;01H[=3h[2J[01;01[2J[01;01H[=3h[2J[01;01H[2J[01;01H[=3h[2J[01;01HKEY MAPPING FOR CONSOLE REDIRECTION: Use the <1> key sequence for Use the <2> key sequence for Use the <3> key sequence for Use the <0> key sequence for Use the key sequence for Use the <@> key sequence for Use the key sequence for Use the key sequence for Use the key sequence for Use the key sequence for Use the key sequence for , where x is any letter key, and X is the upper case of that key Use the key sequence for Press the spacebar to pause... [2J[01;01H[=3h[2J[01;01H[2J[01;01H[=3h[2J[01;01HInitializing PCIe, USB, and Video... Done [2J[01;01H(B[?1;6;7l>[?25h[0;37;40m[2J[5;1H Press the spacebar to pause... 
Re: [PATCH 3/3] drm/amdgpu: CIK support is no longer experimental
On Thu, Apr 13, 2017 at 6:41 PM, Nicolai Hähnle wrote: > On 11.04.2017 00:06, Felix Kuehling wrote: >> >> On 17-04-08 04:50 AM, Nicolai Hähnle wrote: >>> >>> On 07.04.2017 22:15, Felix Kuehling wrote: Change the wording of the CONFIG_DRM_AMDGPU_CIK option to indicate that it's no longer experimental. Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/Kconfig | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig b/drivers/gpu/drm/amd/amdgpu/Kconfig index f3b6df8..029e3fe 100644 --- a/drivers/gpu/drm/amd/amdgpu/Kconfig +++ b/drivers/gpu/drm/amd/amdgpu/Kconfig @@ -9,11 +9,12 @@ config DRM_AMDGPU_CIK bool "Enable amdgpu support for CIK parts" depends on DRM_AMDGPU help - Choose this option if you want to enable experimental support - for CIK asics. + Choose this option if you want to enable support for CIK asics. - CIK is already supported in radeon. CIK support in amdgpu - is for experimentation and testing. + If you choose No here, CIK ASICs will be supported by the + radeon driver, as in previous kernel versions. Depending on + your choice you will need different user mode (Mesa, X.org) + drivers to support accelerated graphics on CIK. >>> >>> >>> The last part is a bit misleading: while you do need different DDXes, >>> the same Mesa driver (radeonsi) will work with both the radeon and the >>> amdgpu kernel module for CIK. FWIW, the same is true for SI, although >>> older versions of Mesa might stumble when run on the amdgpu kernel >>> module. >> >> >> I see. Do you know the minimum Mesa version required for SI and CIK >> support on amdgpu respectively? > > > For SI, it's Mesa 17.0. > > For CIK, I kind of suspect the support has "always" been there, since the > amdgpu kernel module was originally brought up on CIK, but maybe Marek knows > more. Yes, CIK Mesa support should work with all amdgpu versions. Marek ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/dp-helper: DP_TEST_MISC1 should be DP_TEST_MISC0
> -Original Message- > From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf > Of Harry Wentland > Sent: Thursday, April 13, 2017 10:34 AM > To: amd-gfx@lists.freedesktop.org > Cc: Wentland, Harry > Subject: [PATCH] drm/dp-helper: DP_TEST_MISC1 should be > DP_TEST_MISC0 > > Bring this in line with spec and what commit in upstream drm tree. > > Signed-off-by: Harry Wentland I think you forgot to commit the relevant change on the DC side as this breaks the DC compile. Alex > --- > > This brings this definition in amd-staging-4.9 in line with upstream. > > include/drm/drm_dp_helper.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h > index 4b14a7674be1..d6a5015976d9 100644 > --- a/include/drm/drm_dp_helper.h > +++ b/include/drm/drm_dp_helper.h > @@ -419,7 +419,7 @@ > > #define DP_TEST_PATTERN 0x221 > > -#define DP_TEST_MISC1 0x232 > +#define DP_TEST_MISC0 0x232 > > #define DP_TEST_CRC_R_CR 0x240 > #define DP_TEST_CRC_G_Y 0x242 > -- > 2.11.0 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 3/3] drm/amdgpu: CIK support is no longer experimental
On 11.04.2017 00:06, Felix Kuehling wrote: On 17-04-08 04:50 AM, Nicolai Hähnle wrote: On 07.04.2017 22:15, Felix Kuehling wrote: Change the wording of the CONFIG_DRM_AMDGPU_CIK option to indicate that it's no longer experimental. Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/Kconfig | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/Kconfig b/drivers/gpu/drm/amd/amdgpu/Kconfig index f3b6df8..029e3fe 100644 --- a/drivers/gpu/drm/amd/amdgpu/Kconfig +++ b/drivers/gpu/drm/amd/amdgpu/Kconfig @@ -9,11 +9,12 @@ config DRM_AMDGPU_CIK bool "Enable amdgpu support for CIK parts" depends on DRM_AMDGPU help - Choose this option if you want to enable experimental support - for CIK asics. + Choose this option if you want to enable support for CIK asics. - CIK is already supported in radeon. CIK support in amdgpu - is for experimentation and testing. + If you choose No here, CIK ASICs will be supported by the + radeon driver, as in previous kernel versions. Depending on + your choice you will need different user mode (Mesa, X.org) + drivers to support accelerated graphics on CIK. The last part is a bit misleading: while you do need different DDXes, the same Mesa driver (radeonsi) will work with both the radeon and the amdgpu kernel module for CIK. FWIW, the same is true for SI, although older versions of Mesa might stumble when run on the amdgpu kernel module. I see. Do you know the minimum Mesa version required for SI and CIK support on amdgpu respectively? For SI, it's Mesa 17.0. For CIK, I kind of suspect the support has "always" been there, since the amdgpu kernel module was originally brought up on CIK, but maybe Marek knows more. Cheers, Nicolai Thanks, Felix Cheers, Nicolai config DRM_AMDGPU_USERPTR bool "Always enable userptr write support" -- Learn how the world really is, but never forget how it ought to be. ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: Add kernel parameter to manage memory error handling.
On 13/04/17 11:38 AM, Panariti, David wrote: + Vilas -Original Message- From: Deucher, Alexander Sent: Wednesday, April 12, 2017 9:29 PM To: 'Michel Dänzer' ; Panariti, David Cc: amd-gfx@lists.freedesktop.org Subject: RE: [PATCH] drm/amdgpu: Add kernel parameter to manage memory error handling. -Original Message- From: Michel Dänzer [mailto:mic...@daenzer.net] Sent: Wednesday, April 12, 2017 9:17 PM To: Panariti, David Cc: Deucher, Alexander; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Add kernel parameter to manage memory error handling. On 13/04/17 02:38 AM, Panariti, David wrote: From: Michel Dänzer [mailto:mic...@daenzer.net] @@ -212,6 +213,9 @@ module_param_named(cg_mask, amdgpu_cg_mask, uint, 0444); MODULE_PARM_DESC(pg_mask, "Powergating flags mask (0 = disable power gating)"); module_param_named(pg_mask, amdgpu_pg_mask, uint, 0444); +MODULE_PARM_DESC(ecc_mask, "ECC/EDC flags mask (0 = disable +ECC/EDC)"); "0 = disable ECC/EDC" implies that they're enabled by default? Was that already the case before this patch? [davep] Yes it was, and there was actually a problem in some cases where the CZ would hang which is why I added the param. I was wondering if it would be better to default to them being off, but I wasn't sure how important maintaining original behavior is considered. Actually, there are some bugs in the workaround function as it is, so it really should default to off. I agree. There have been some bug reports about Carrizo hangs, I wonder if any of those might be related to this. Only the embedded SKUs support EDC. If they are embedded parts, it could be related. [davep] Sorry for the length, but I wanted all of the details out there for the most informed decision. Another thing is that they can go from not hanging to hanging for no discernable reason. The KIQ changes, however, have seemed to have fixed it. For one chip and a few tens of reboots. There is also the issue of improperly initialized *gpr registers. From the doc: "Due to a hardware condition whereby some shader instructions utilize uninitialized SGPRs and/or VGPRs, the S/VPGR memories must be initialized prior to EDC operation, as, not doing so will cause erroneous counts to show up in the EDC counters." I seem to recall Vilas saying it is the poison that isn't reset properly. But I'm not sure about the actual register contents. Vilas? I suggest, at a minimum, checking for cz *and* the EDC fuse. If so, explicitly disable EDC, run the shaders to zero the *gprs, leave EDC disabled, and merge in the existing new code to zero all of the counters. All of the code exists, in one place or another. The parameter to enable/disable probably won't be needed until EDC is fully implemented. However, EDC can be enabled in a way that simply allows it to count errors. This has never caused a hang for me. The counts are useful for reliability research. This was one of the goals of the original EDC task. A umr script could be written (by interested parties) to read the counters and in fact enable the EDC counters. I think this should be done if anyone is interested in the numbers. Vilas? Any R&R work left in this area? Do you think customers would be interested in doing this on their own? For the special keeners we could add EDC counters to umr's --top and then in theory it'll be included in the log output. If you can send me info on how to enable/read the counters I can take a look at it. Cheers, Tom ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: Add kernel parameter to manage memory error handling.
+ Vilas > -Original Message- > From: Deucher, Alexander > Sent: Wednesday, April 12, 2017 9:29 PM > To: 'Michel Dänzer' ; Panariti, David > > Cc: amd-gfx@lists.freedesktop.org > Subject: RE: [PATCH] drm/amdgpu: Add kernel parameter to manage > memory error handling. > > > -Original Message- > > From: Michel Dänzer [mailto:mic...@daenzer.net] > > Sent: Wednesday, April 12, 2017 9:17 PM > > To: Panariti, David > > Cc: Deucher, Alexander; amd-gfx@lists.freedesktop.org > > Subject: Re: [PATCH] drm/amdgpu: Add kernel parameter to manage > memory > > error handling. > > > > On 13/04/17 02:38 AM, Panariti, David wrote: > > >> From: Michel Dänzer [mailto:mic...@daenzer.net] > > >> > > >>> @@ -212,6 +213,9 @@ module_param_named(cg_mask, > > >> amdgpu_cg_mask, uint, > > >>> 0444); MODULE_PARM_DESC(pg_mask, "Powergating flags mask (0 = > > >> disable > > >>> power gating)"); module_param_named(pg_mask, > amdgpu_pg_mask, > > >> uint, > > >>> 0444); > > >>> > > >>> +MODULE_PARM_DESC(ecc_mask, "ECC/EDC flags mask (0 = disable > > >>> +ECC/EDC)"); > > >> > > >> "0 = disable ECC/EDC" implies that they're enabled by default? Was > > >> that already the case before this patch? > > > > > > [davep] Yes it was, and there was actually a problem in some cases > > > where the CZ would hang which is why I added the param. I was > > > wondering if it would be better to default to them being off, but I > > > wasn't sure how important maintaining original behavior is > > > considered. Actually, there are some bugs in the workaround function > > > as it is, so it really should default to off. > > > > I agree. There have been some bug reports about Carrizo hangs, I > > wonder if any of those might be related to this. > > Only the embedded SKUs support EDC. If they are embedded parts, it could > be related. [davep] Sorry for the length, but I wanted all of the details out there for the most informed decision. Another thing is that they can go from not hanging to hanging for no discernable reason. The KIQ changes, however, have seemed to have fixed it. For one chip and a few tens of reboots. There is also the issue of improperly initialized *gpr registers. From the doc: "Due to a hardware condition whereby some shader instructions utilize uninitialized SGPRs and/or VGPRs, the S/VPGR memories must be initialized prior to EDC operation, as, not doing so will cause erroneous counts to show up in the EDC counters." I seem to recall Vilas saying it is the poison that isn't reset properly. But I'm not sure about the actual register contents. Vilas? I suggest, at a minimum, checking for cz *and* the EDC fuse. If so, explicitly disable EDC, run the shaders to zero the *gprs, leave EDC disabled, and merge in the existing new code to zero all of the counters. All of the code exists, in one place or another. The parameter to enable/disable probably won't be needed until EDC is fully implemented. However, EDC can be enabled in a way that simply allows it to count errors. This has never caused a hang for me. The counts are useful for reliability research. This was one of the goals of the original EDC task. A umr script could be written (by interested parties) to read the counters and in fact enable the EDC counters. I think this should be done if anyone is interested in the numbers. Vilas? Any R&R work left in this area? Do you think customers would be interested in doing this on their own? davep > > Alex ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: amdgpu 0000:84:00.0: gpu post error! \\ Fatal error during GPU init
> -Original Message- > From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf > Of Dennis Schridde > Sent: Thursday, April 13, 2017 10:18 AM > To: amd-gfx@lists.freedesktop.org > Subject: amdgpu :84:00.0: gpu post error! \\ Fatal error during GPU init > > Hello! > > I am trying to use an AMD FirePro S7150X2 with the AMDGPU driver of a > Linux > 4.10.9 kernel (CoreOS Container Linux) and linux-firmware > e39f0e3e6897ad865b3704f61218ae83f98a85da, but I run into the following > error > after the amdgpu module is loaded: > > [ 17.692746] amdgpu :84:00.0: enabling device ( -> 0003) > [ 17.692940] [drm] initializing kernel modesetting (TONGA 0x1002:0x6929 > 0x1002:0x0334 0x00). > [ 17.692963] [drm] register mmio base: 0xD010 > [ 17.692964] [drm] register mmio size: 262144 > [ 17.692970] [drm] doorbell mmio base: 0xF000 > [ 17.692971] [drm] doorbell mmio size: 2097152 > [ 17.692980] [drm] probing gen 2 caps for device 10b5:8747 = 8796103/10e > [ 17.692981] [drm] probing mlw for device 10b5:8747 = 8796103 > [ 17.692992] [drm] VCE enabled in physical mode > [ 18.648132] ATOM BIOS: C76301 > [ 18.651758] [drm] GPU posting now... > [ 23.661513] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios > stuck in > loop for more than 5secs aborting > [ 23.673155] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios > stuck > executing F250 (len 334, WS 4, PS 0) @ 0xF365 > [ 23.685453] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios > stuck > executing DB34 (len 324, WS 4, PS 0) @ 0xDC2C > [ 23.697816] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios > stuck > executing BCDE (len 254, WS 0, PS 4) @ 0xBDB4 > [ 23.710137] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios > stuck > executing B832 (len 143, WS 0, PS 8) @ 0xB8A9 > [ 23.722451] amdgpu :84:00.0: gpu post error! > [ 23.727950] amdgpu :84:00.0: Fatal error during GPU init Posting the GPU is failing. This is the initial basic asic setup that is required before anything else can happen. There seem to be timeouts waiting for some register states. Is there anything special about your setup? Can you try a vanilla kernel? Alex > [ 23.734594] [drm] amdgpu: finishing device. > [ 23.739592] [ cut here ] > ... > [ 24.096608] ---[ end trace 88c8cb35b32e3b88 ]--- > [ 24.102086] BUG: unable to handle kernel NULL pointer dereference at > 0018 > [ 24.111438] IP: __ww_mutex_lock+0x24/0xa0 > [ 24.116222] PGD 0 > [ 24.116223] > [ 24.120737] Oops: 0002 [#1] SMP > ... > > Please find a full log attached. > > My kernel configuration is available at: > https://github.com/urzds/coreos-overlay/blob/hpc_support/sys- > kernel/coreos-modules/files/{commonconfig-4.10,amd64_defconfig-4.10} > Please refer to the commit log of the "hpc_support" branch for my > changes > compared to the CoreOS CL stock config. > > I would be very glad if you could help me in debugging the issue and getting > the GPU running. > > Thanks, > Dennis ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: AMDGPU without display output
Thanks, Alex! ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: AMDGPU without display output
> -Original Message- > From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf > Of Dennis Schridde > Sent: Thursday, April 13, 2017 10:32 AM > To: amd-gfx@lists.freedesktop.org > Subject: AMDGPU without display output > > Hello again! > > I am trying to use an AMD FirePro S7150X2 with the AMDGPU driver of a Linux > 4.10.9 kernel (CoreOS Container Linux) and linux-firmware > e39f0e3e6897ad865b3704f61218ae83f98a85da. > > Since the card has no display output and I want to run remote applications > only, I would like to prevent any interference with mode setting and the > kernel console. Thus I set "nomodeset" on the kernel command line to > prevent > the kernel from trying to initialise anything but the rendering functions of > the card. However, this leads to the following error message: > > [drm:init_module [amdgpu]] *ERROR* VGACON disables amdgpu kernel > modesetting. > > The result is that the AMDGPU module cannot be loaded. nomodeset prevents the driver from loading. It's a way to disable KMS altogether. > > Is this generally the right approach to use this driver for rendering without > display output, or can I safely leave KMS enabled and it will not interfere > with my application's X servers and the OpenGL applications running on > them? > The driver only exposes display connectors when they exist. > Assuming I have to disable KMS, how would I get past this error, i.e. to > initialise the card's rendering functions, but skipping initialisation of the > output part of the driver? > You don't have to disable KMS; there is no way to. The driver will expose the hw that exists on the card. If there are no display connectors, none will be exposed. It's up to the user to configure X to use or not use specific cards. > Generally asking: How do people use this card for GPGPU compute (i.e. > headless) tasks? Is there some documentation on what I need to pay attention > to? It should just work as long as the driver is loaded. Alex ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
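To make the last point above concrete: one common way to steer an X server toward, or away from, a particular card is an explicit Device section keyed by bus ID. A minimal xorg.conf sketch, assuming the S7150X2 sits at PCI 0000:84:00.0 as in this thread; the Identifier is an arbitrary illustrative name, and X expects the BusID in decimal (0x84 == 132):

Section "Device"
    Identifier "AmdHeadless"        # illustrative name, not from the thread
    Driver     "amdgpu"
    BusID      "PCI:132:0:0"        # 0000:84:00.0, bus number converted to decimal
EndSection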
AMDGPU without display output
Hello again! I am trying to use an AMD FirePro S7150X2 with the AMDGPU driver of a Linux 4.10.9 kernel (CoreOS Container Linux) and linux-firmware e39f0e3e6897ad865b3704f61218ae83f98a85da. Since the card has no display output and I want to run remote applications only, I would like to prevent any interference with mode setting and the kernel console. Thus I set "nomodeset" on the kernel command line to prevent the kernel from trying to initialise anything but the rendering functions of the card. However, this leads to the following error message: [drm:init_module [amdgpu]] *ERROR* VGACON disables amdgpu kernel modesetting. The result is that the AMDGPU module cannot be loaded. Is this generally the right approach to use this driver for rendering without display output, or can I safely leave KMS enabled and it will not interfere with my application's X servers and the OpenGL applications running on them? Assuming I have to disable KMS, how would I get past this error, i.e. to initialise the card's rendering functions, but skipping initialisation of the output part of the driver? Generally asking: How do people use this card for GPGPU compute (i.e. headless) tasks? Is there some documentation on what I need to pay attention to? Thanks, Dennis ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
amdgpu 0000:84:00.0: gpu post error! \\ Fatal error during GPU init
Hello! I am trying to use an AMD FirePro S7150X2 with the AMDGPU driver of a Linux 4.10.9 kernel (CoreOS Container Linux) and linux-firmware e39f0e3e6897ad865b3704f61218ae83f98a85da, but I run into the following error after the amdgpu module is loaded: [ 17.692746] amdgpu :84:00.0: enabling device ( -> 0003) [ 17.692940] [drm] initializing kernel modesetting (TONGA 0x1002:0x6929 0x1002:0x0334 0x00). [ 17.692963] [drm] register mmio base: 0xD010 [ 17.692964] [drm] register mmio size: 262144 [ 17.692970] [drm] doorbell mmio base: 0xF000 [ 17.692971] [drm] doorbell mmio size: 2097152 [ 17.692980] [drm] probing gen 2 caps for device 10b5:8747 = 8796103/10e [ 17.692981] [drm] probing mlw for device 10b5:8747 = 8796103 [ 17.692992] [drm] VCE enabled in physical mode [ 18.648132] ATOM BIOS: C76301 [ 18.651758] [drm] GPU posting now... [ 23.661513] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios stuck in loop for more than 5secs aborting [ 23.673155] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios stuck executing F250 (len 334, WS 4, PS 0) @ 0xF365 [ 23.685453] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios stuck executing DB34 (len 324, WS 4, PS 0) @ 0xDC2C [ 23.697816] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios stuck executing BCDE (len 254, WS 0, PS 4) @ 0xBDB4 [ 23.710137] [drm:amdgpu_connector_add [amdgpu]] *ERROR* atombios stuck executing B832 (len 143, WS 0, PS 8) @ 0xB8A9 [ 23.722451] amdgpu :84:00.0: gpu post error! [ 23.727950] amdgpu :84:00.0: Fatal error during GPU init [ 23.734594] [drm] amdgpu: finishing device. [ 23.739592] [ cut here ] ... [ 24.096608] ---[ end trace 88c8cb35b32e3b88 ]--- [ 24.102086] BUG: unable to handle kernel NULL pointer dereference at 0018 [ 24.111438] IP: __ww_mutex_lock+0x24/0xa0 [ 24.116222] PGD 0 [ 24.116223] [ 24.120737] Oops: 0002 [#1] SMP ... Please find a full log attached. My kernel configuration is available at: https://github.com/urzds/coreos-overlay/blob/hpc_support/sys-kernel/coreos-modules/files/{commonconfig-4.10,amd64_defconfig-4.10} Please refer to the commit log of the "hpc_support" branch for my changes compared to the CoreOS CL stock config. I would be very glad if you could help me in debugging the issue and getting the GPU running. Thanks, Dennis
[PATCH libdrm 0/2] amdgpu: add amdgpu_cs_wait_fences
Hi all, These changes expose a function to call the WAIT_FENCES ioctl for waiting on multiple fences at the same time. This is useful for Vulkan. They are mostly changes that have been in the amdgpu-pro libdrm for a long time. I've taken the liberty to clean them up a bit and add some missing bits. Please review! Thanks, Nicolai ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH libdrm 1/2] amdgpu: add the interface of waiting multiple fences
From: Nicolai Hähnle Signed-off-by: Junwei Zhang [v2: allow returning the first signaled fence index] Signed-off-by: monk.liu [v3: - cleanup *status setting - fix amdgpu symbols check] Signed-off-by: Nicolai Hähnle Reviewed-by: Christian König (v1) Reviewed-by: Jammy Zhou (v1) --- amdgpu/amdgpu-symbol-check | 1 + amdgpu/amdgpu.h| 23 ++ amdgpu/amdgpu_cs.c | 74 ++ 3 files changed, 98 insertions(+) diff --git a/amdgpu/amdgpu-symbol-check b/amdgpu/amdgpu-symbol-check index 4d1ae65..81ef9b4 100755 --- a/amdgpu/amdgpu-symbol-check +++ b/amdgpu/amdgpu-symbol-check @@ -26,20 +26,21 @@ amdgpu_bo_va_op_raw amdgpu_bo_wait_for_idle amdgpu_create_bo_from_user_mem amdgpu_cs_create_semaphore amdgpu_cs_ctx_create amdgpu_cs_ctx_free amdgpu_cs_destroy_semaphore amdgpu_cs_query_fence_status amdgpu_cs_query_reset_state amdgpu_cs_signal_semaphore amdgpu_cs_submit +amdgpu_cs_wait_fences amdgpu_cs_wait_semaphore amdgpu_device_deinitialize amdgpu_device_initialize amdgpu_get_marketing_name amdgpu_query_buffer_size_alignment amdgpu_query_crtc_from_id amdgpu_query_firmware_version amdgpu_query_gds_info amdgpu_query_gpu_info amdgpu_query_heap_info diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h index 55884b2..fdea905 100644 --- a/amdgpu/amdgpu.h +++ b/amdgpu/amdgpu.h @@ -900,20 +900,43 @@ int amdgpu_cs_submit(amdgpu_context_handle context, * returned in the case if submission was completed or timeout error * code. * * \sa amdgpu_cs_submit() */ int amdgpu_cs_query_fence_status(struct amdgpu_cs_fence *fence, uint64_t timeout_ns, uint64_t flags, uint32_t *expired); +/** + * Wait for multiple fences + * + * \param fences - \c [in] The fence array to wait + * \param fence_count - \c [in] The fence count + * \param wait_all- \c [in] If true, wait all fences to be signaled, + *otherwise, wait at least one fence + * \param timeout_ns - \c [in] The timeout to wait, in nanoseconds + * \param status - \c [out] '1' for signaled, '0' for timeout + * \param first - \c [out] the index of the first signaled fence from @fences + * + * \return 0 on success + * <0 - Negative POSIX Error code + * + * \noteCurrently it supports only one amdgpu_device. All fences come from + * the same amdgpu_device with the same fd. 
+*/ +int amdgpu_cs_wait_fences(struct amdgpu_cs_fence *fences, + uint32_t fence_count, + bool wait_all, + uint64_t timeout_ns, + uint32_t *status, uint32_t *first); + /* * Query / Info API * */ /** * Query allocation size alignments * * UMD should query information about GPU VM MC size alignments requirements * to be able correctly choose required allocation size and implement diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c index fb5b3a8..707e6d1 100644 --- a/amdgpu/amdgpu_cs.c +++ b/amdgpu/amdgpu_cs.c @@ -436,20 +436,94 @@ int amdgpu_cs_query_fence_status(struct amdgpu_cs_fence *fence, r = amdgpu_ioctl_wait_cs(fence->context, fence->ip_type, fence->ip_instance, fence->ring, fence->fence, timeout_ns, flags, &busy); if (!r && !busy) *expired = true; return r; } +static int amdgpu_ioctl_wait_fences(struct amdgpu_cs_fence *fences, + uint32_t fence_count, + bool wait_all, + uint64_t timeout_ns, + uint32_t *status, + uint32_t *first) +{ + struct drm_amdgpu_fence *drm_fences; + amdgpu_device_handle dev = fences[0].context->dev; + union drm_amdgpu_wait_fences args; + int r; + uint32_t i; + + drm_fences = alloca(sizeof(struct drm_amdgpu_fence) * fence_count); + for (i = 0; i < fence_count; i++) { + drm_fences[i].ctx_id = fences[i].context->id; + drm_fences[i].ip_type = fences[i].ip_type; + drm_fences[i].ip_instance = fences[i].ip_instance; + drm_fences[i].ring = fences[i].ring; + drm_fences[i].seq_no = fences[i].fence; + } + + memset(&args, 0, sizeof(args)); + args.in.fences = (uint64_t)(uintptr_t)drm_fences; + args.in.fence_count = fence_count; + args.in.wait_all = wait_all; + args.in.timeout_ns = amdgpu_cs_calculate_timeout(timeout_ns); + + r = drmIoctl(dev->fd, DRM_IOCTL_AMDGPU_WAIT_FENCES, &args); + if (r) + return -errno; + + *status = args.out.status; + + if (first) + *first = args.out.first_signaled; + + return 0; +} + +int amdgpu_cs
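A minimal usage sketch of the new entry point, pieced together from the signature and the struct amdgpu_cs_fence fields shown in the patch above; the context handle and sequence numbers are assumed to come from earlier amdgpu_cs_ctx_create() and amdgpu_cs_submit() calls, and the helper name is illustrative:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <amdgpu.h>
#include <amdgpu_drm.h>

static int wait_for_any(amdgpu_context_handle ctx, const uint64_t seq[2])
{
	struct amdgpu_cs_fence fences[2];
	uint32_t status = 0, first = 0;
	int i, r;

	memset(fences, 0, sizeof(fences));
	for (i = 0; i < 2; i++) {
		fences[i].context = ctx;
		fences[i].ip_type = AMDGPU_HW_IP_GFX;
		fences[i].ip_instance = 0;
		fences[i].ring = 0;
		fences[i].fence = seq[i];	/* seq no returned by amdgpu_cs_submit */
	}

	/* wait_all == false: return as soon as at least one fence signals */
	r = amdgpu_cs_wait_fences(fences, 2, false, 1000000000ull /* 1s */,
				  &status, &first);
	if (r)
		return r;	/* negative POSIX error code */

	if (status)
		printf("fence %u signaled first\n", first);
	else
		printf("timed out waiting on the fences\n");
	return 0;
}

With wait_all set to true, the call instead blocks until every fence in the array has signaled, per the doc comment above.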
[PATCH libdrm 2/2] amdgpu: add a test for amdgpu_cs_wait_fences
From: Nicolai Hähnle Signed-off-by: monk.liu [v2: actually hook up the test case] Signed-off-by: Nicolai Hähnle --- tests/amdgpu/basic_tests.c | 100 + 1 file changed, 100 insertions(+) diff --git a/tests/amdgpu/basic_tests.c b/tests/amdgpu/basic_tests.c index 4dce67e..8d5844b 100644 --- a/tests/amdgpu/basic_tests.c +++ b/tests/amdgpu/basic_tests.c @@ -38,34 +38,36 @@ #include "amdgpu_drm.h" static amdgpu_device_handle device_handle; static uint32_t major_version; static uint32_t minor_version; static void amdgpu_query_info_test(void); static void amdgpu_memory_alloc(void); static void amdgpu_command_submission_gfx(void); static void amdgpu_command_submission_compute(void); +static void amdgpu_command_submission_multi_fence(void); static void amdgpu_command_submission_sdma(void); static void amdgpu_userptr_test(void); static void amdgpu_semaphore_test(void); static void amdgpu_command_submission_write_linear_helper(unsigned ip_type); static void amdgpu_command_submission_const_fill_helper(unsigned ip_type); static void amdgpu_command_submission_copy_linear_helper(unsigned ip_type); CU_TestInfo basic_tests[] = { { "Query Info Test", amdgpu_query_info_test }, { "Memory alloc Test", amdgpu_memory_alloc }, { "Userptr Test", amdgpu_userptr_test }, { "Command submission Test (GFX)", amdgpu_command_submission_gfx }, { "Command submission Test (Compute)", amdgpu_command_submission_compute }, + { "Command submission Test (Multi-Fence)", amdgpu_command_submission_multi_fence }, { "Command submission Test (SDMA)", amdgpu_command_submission_sdma }, { "SW semaphore Test", amdgpu_semaphore_test }, CU_TEST_INFO_NULL, }; #define BUFFER_SIZE (8 * 1024) #define SDMA_PKT_HEADER_op_offset 0 #define SDMA_PKT_HEADER_op_mask 0x00FF #define SDMA_PKT_HEADER_op_shift 0 #define SDMA_PKT_HEADER_OP(x) (((x) & SDMA_PKT_HEADER_op_mask) << SDMA_PKT_HEADER_op_shift) #define SDMA_OPCODE_CONSTANT_FILL 11 @@ -1142,20 +1144,118 @@ static void amdgpu_command_submission_sdma_copy_linear(void) amdgpu_command_submission_copy_linear_helper(AMDGPU_HW_IP_DMA); } static void amdgpu_command_submission_sdma(void) { amdgpu_command_submission_sdma_write_linear(); amdgpu_command_submission_sdma_const_fill(); amdgpu_command_submission_sdma_copy_linear(); } +static void amdgpu_command_submission_multi_fence_wait_all(bool wait_all) +{ + amdgpu_context_handle context_handle; + amdgpu_bo_handle ib_result_handle, ib_result_ce_handle; + void *ib_result_cpu, *ib_result_ce_cpu; + uint64_t ib_result_mc_address, ib_result_ce_mc_address; + struct amdgpu_cs_request ibs_request[2] = {0}; + struct amdgpu_cs_ib_info ib_info[2]; + struct amdgpu_cs_fence fence_status[2] = {0}; + uint32_t *ptr; + uint32_t expired; + amdgpu_bo_list_handle bo_list; + amdgpu_va_handle va_handle, va_handle_ce; + int r; + int i, ib_cs_num = 2; + + r = amdgpu_cs_ctx_create(device_handle, &context_handle); + CU_ASSERT_EQUAL(r, 0); + + r = amdgpu_bo_alloc_and_map(device_handle, 4096, 4096, + AMDGPU_GEM_DOMAIN_GTT, 0, + &ib_result_handle, &ib_result_cpu, + &ib_result_mc_address, &va_handle); + CU_ASSERT_EQUAL(r, 0); + + r = amdgpu_bo_alloc_and_map(device_handle, 4096, 4096, + AMDGPU_GEM_DOMAIN_GTT, 0, + &ib_result_ce_handle, &ib_result_ce_cpu, + &ib_result_ce_mc_address, &va_handle_ce); + CU_ASSERT_EQUAL(r, 0); + + r = amdgpu_get_bo_list(device_handle, ib_result_handle, + ib_result_ce_handle, &bo_list); + CU_ASSERT_EQUAL(r, 0); + + memset(ib_info, 0, 2 * sizeof(struct amdgpu_cs_ib_info)); + + /* IT_SET_CE_DE_COUNTERS */ + ptr = ib_result_ce_cpu; + ptr[0] = 0xc0008900; + ptr[1] = 0; + 
ptr[2] = 0xc0008400; + ptr[3] = 1; + ib_info[0].ib_mc_address = ib_result_ce_mc_address; + ib_info[0].size = 4; + ib_info[0].flags = AMDGPU_IB_FLAG_CE; + + /* IT_WAIT_ON_CE_COUNTER */ + ptr = ib_result_cpu; + ptr[0] = 0xc0008600; + ptr[1] = 0x0001; + ib_info[1].ib_mc_address = ib_result_mc_address; + ib_info[1].size = 2; + + for (i = 0; i < ib_cs_num; i++) { + ibs_request[i].ip_type = AMDGPU_HW_IP_GFX; + ibs_request[i].number_of_ibs = 2; + ibs_request[i].ibs = ib_info; + ibs_request[i].resources = bo_list; + ibs_request[i].fence_info.handle = NULL; + } + + r = amdgpu_cs_submit(context_handle, 0,ibs_request, ib_cs_num); + + CU_ASSERT_EQUAL(r, 0); + + for (i = 0; i
Re: [PATCH] drm/dp-helper: DP_TEST_MISC1 should be DP_TEST_MISC0
On Thu, Apr 13, 2017 at 10:34 AM, Harry Wentland wrote: > Bring this in line with spec and what commit in upstream drm tree. > > Signed-off-by: Harry Wentland Acked-by: Alex Deucher > --- > > This brings this definition in amd-staging-4.9 in line with upstream. > > include/drm/drm_dp_helper.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h > index 4b14a7674be1..d6a5015976d9 100644 > --- a/include/drm/drm_dp_helper.h > +++ b/include/drm/drm_dp_helper.h > @@ -419,7 +419,7 @@ > > #define DP_TEST_PATTERN0x221 > > -#define DP_TEST_MISC1 0x232 > +#define DP_TEST_MISC0 0x232 > > #define DP_TEST_CRC_R_CR 0x240 > #define DP_TEST_CRC_G_Y0x242 > -- > 2.11.0 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/dp-helper: DP_TEST_MISC1 should be DP_TEST_MISC0
Bring this in line with spec and what commit in upstream drm tree. Signed-off-by: Harry Wentland --- This brings this definition in amd-staging-4.9 in line with upstream. include/drm/drm_dp_helper.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h index 4b14a7674be1..d6a5015976d9 100644 --- a/include/drm/drm_dp_helper.h +++ b/include/drm/drm_dp_helper.h @@ -419,7 +419,7 @@ #define DP_TEST_PATTERN0x221 -#define DP_TEST_MISC1 0x232 +#define DP_TEST_MISC0 0x232 #define DP_TEST_CRC_R_CR 0x240 #define DP_TEST_CRC_G_Y0x242 -- 2.11.0 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] drm/amdgpu: fix dead lock if any ip block resume failed in s3
> -Original Message- > From: Huang Rui [mailto:ray.hu...@amd.com] > Sent: Thursday, April 13, 2017 4:12 AM > To: amd-gfx@lists.freedesktop.org; Deucher, Alexander > Cc: Koenig, Christian; Wang, Ken; Huang, Ray > Subject: [PATCH] drm/amdgpu: fix dead lock if any ip block resume failed in > s3 > > Driver must free the console lock whether driver resuming successful > or not. Otherwise, fb_console will be always waiting for the lock and > then cause system stuck. > > [ 244.405541] INFO: task kworker/0:0:4 blocked for more than 120 seconds. > [ 244.405543] Tainted: G OE 4.9.0-custom #1 > [ 244.405544] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [ 244.405541] INFO: task kworker/0:0:4 blocked for more than 120 seconds. > [ 244.405543] Tainted: G OE 4.9.0-custom #1 > [ 244.405544] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [ 244.405550] kworker/0:0 D0 4 2 0x0008 > [ 244.405559] Workqueue: events console_callback > [ 244.405564] 88045a2cfc00 880462b75940 > 81c0e500 > [ 244.405568] 880476419280 c900018f7c90 817dcf62 > 003c > [ 244.405572] 0001 0002 880462b75940 > 880462b75940 > [ 244.405573] Call Trace: > [ 244.405580] [] ? __schedule+0x222/0x6a0 > [ 244.405584] [] schedule+0x36/0x80 > [ 244.405588] [] schedule_timeout+0x1fc/0x390 > [ 244.405592] [] __down_common+0xa5/0xf8 > [ 244.405598] [] ? put_prev_entity+0x48/0x710 > [ 244.405601] [] __down+0x1d/0x1f > [ 244.405606] [] down+0x41/0x50 > [ 244.405611] [] console_lock+0x1a/0x40 > [ 244.405614] [] console_callback+0x13/0x160 > [ 244.405617] [] ? __schedule+0x22a/0x6a0 > [ 244.405623] [] process_one_work+0x153/0x3f0 > [ 244.405628] [] worker_thread+0x12b/0x4b0 > [ 244.405633] [] ? rescuer_thread+0x350/0x350 > [ 244.405637] [] kthread+0xd3/0xf0 > [ 244.405641] [] ? kthread_park+0x60/0x60 > [ 244.405645] [] ? 
kthread_park+0x60/0x60 > [ 244.405649] [] ret_from_fork+0x25/0x30 > > Signed-off-by: Huang Rui Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 28 --- > - > 1 file changed, 12 insertions(+), 16 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index bd3a0d5..abb4dcc 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -2280,7 +2280,7 @@ int amdgpu_device_resume(struct drm_device > *dev, bool resume, bool fbcon) > struct drm_connector *connector; > struct amdgpu_device *adev = dev->dev_private; > struct drm_crtc *crtc; > - int r; > + int r = 0; > > if (dev->switch_power_state == DRM_SWITCH_POWER_OFF) > return 0; > @@ -2292,11 +2292,8 @@ int amdgpu_device_resume(struct drm_device > *dev, bool resume, bool fbcon) > pci_set_power_state(dev->pdev, PCI_D0); > pci_restore_state(dev->pdev); > r = pci_enable_device(dev->pdev); > - if (r) { > - if (fbcon) > - console_unlock(); > - return r; > - } > + if (r) > + goto unlock; > } > if (adev->is_atom_fw) > amdgpu_atomfirmware_scratch_regs_restore(adev); > @@ -2313,7 +2310,7 @@ int amdgpu_device_resume(struct drm_device > *dev, bool resume, bool fbcon) > r = amdgpu_resume(adev); > if (r) { > DRM_ERROR("amdgpu_resume failed (%d).\n", r); > - return r; > + goto unlock; > } > amdgpu_fence_driver_resume(adev); > > @@ -2324,11 +2321,8 @@ int amdgpu_device_resume(struct drm_device > *dev, bool resume, bool fbcon) > } > > r = amdgpu_late_init(adev); > - if (r) { > - if (fbcon) > - console_unlock(); > - return r; > - } > + if (r) > + goto unlock; > > /* pin cursors */ > list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) { > @@ -2349,7 +2343,7 @@ int amdgpu_device_resume(struct drm_device > *dev, bool resume, bool fbcon) > } > r = amdgpu_amdkfd_resume(adev); > if (r) > - return r; > + goto unlock; > > /* blat the mode back in */ > if (fbcon) { > @@ -2396,12 +2390,14 @@ int amdgpu_device_resume(struct drm_device > *dev, bool resume, bool fbcon) > dev->dev->power.disable_depth--; > #endif > > - if (fbcon) { > + if (fbcon) > amdgpu_fbdev_set_suspend(adev, 0); > + > +unlock: > + if (fbcon) > console_unlock(); > - } > > - return 0; > + return r; > } > > static bool amdgpu_check_soft_reset(struct amdgpu_device *adev) > -- > 2.7.4 __
Re: [PATCH] drm/amdgpu: fix dead lock if any ip block resume failed in s3
On 13/04/17 05:12 PM, Huang Rui wrote: > Driver must free the console lock whether driver resuming successful > or not. Otherwise, fb_console will be always waiting for the lock and > then cause system stuck. > > [ 244.405541] INFO: task kworker/0:0:4 blocked for more than 120 seconds. > [ 244.405543] Tainted: G OE 4.9.0-custom #1 > [ 244.405544] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [ 244.405541] INFO: task kworker/0:0:4 blocked for more than 120 seconds. > [ 244.405543] Tainted: G OE 4.9.0-custom #1 > [ 244.405544] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [ 244.405550] kworker/0:0 D0 4 2 0x0008 > [ 244.405559] Workqueue: events console_callback > [ 244.405564] 88045a2cfc00 880462b75940 > 81c0e500 > [ 244.405568] 880476419280 c900018f7c90 817dcf62 > 003c > [ 244.405572] 0001 0002 880462b75940 > 880462b75940 > [ 244.405573] Call Trace: > [ 244.405580] [] ? __schedule+0x222/0x6a0 > [ 244.405584] [] schedule+0x36/0x80 > [ 244.405588] [] schedule_timeout+0x1fc/0x390 > [ 244.405592] [] __down_common+0xa5/0xf8 > [ 244.405598] [] ? put_prev_entity+0x48/0x710 > [ 244.405601] [] __down+0x1d/0x1f > [ 244.405606] [] down+0x41/0x50 > [ 244.405611] [] console_lock+0x1a/0x40 > [ 244.405614] [] console_callback+0x13/0x160 > [ 244.405617] [] ? __schedule+0x22a/0x6a0 > [ 244.405623] [] process_one_work+0x153/0x3f0 > [ 244.405628] [] worker_thread+0x12b/0x4b0 > [ 244.405633] [] ? rescuer_thread+0x350/0x350 > [ 244.405637] [] kthread+0xd3/0xf0 > [ 244.405641] [] ? kthread_park+0x60/0x60 > [ 244.405645] [] ? kthread_park+0x60/0x60 > [ 244.405649] [] ret_from_fork+0x25/0x30 > > Signed-off-by: Huang Rui Reviewed-by: Michel Dänzer -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 2/3] drm/amdgpu: add gtt print like vram when dump mm table
Change-Id: If0474e24e14d237d2d55731871c5ceb11e5a3601 Signed-off-by: Chunming Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 6 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 2 files changed, 10 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c index 8a950a5..4bc1dd6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c @@ -138,6 +138,12 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man, return r; } +void amdgpu_gtt_mgr_print(struct seq_file *m, struct ttm_mem_type_manager *man) +{ + struct amdgpu_gtt_mgr *mgr = man->priv; + seq_printf(m, "man size:%llu pages, gtt available:%llu pages\n", + man->size, mgr->available); +} /** * amdgpu_gtt_mgr_new - allocate a new node * diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index c3112b6..688056e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -1540,6 +1540,8 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo, #if defined(CONFIG_DEBUG_FS) +extern void amdgpu_gtt_mgr_print(struct seq_file *m, struct ttm_mem_type_manager +*man); static int amdgpu_mm_dump_table(struct seq_file *m, void *data) { struct drm_info_node *node = (struct drm_info_node *)m->private; @@ -1558,6 +1560,8 @@ static int amdgpu_mm_dump_table(struct seq_file *m, void *data) adev->mman.bdev.man[ttm_pl].size, (u64)atomic64_read(&adev->vram_usage) >> 20, (u64)atomic64_read(&adev->vram_vis_usage) >> 20); + if (ttm_pl == TTM_PL_TT) + amdgpu_gtt_mgr_print(m, &adev->mman.bdev.man[ttm_pl]); return ret; } -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 3/3] drm/amdgpu: move gtt usage statistic to gtt mgr
Change-Id: Ifea42c8ae2206143d7e22b35eea537ba9e928fe8 Signed-off-by: Chunming Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 13 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 -- 2 files changed, 10 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c index 4bc1dd6..4b282ec 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c @@ -97,6 +97,7 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man, { struct amdgpu_gtt_mgr *mgr = man->priv; struct drm_mm_node *node = mem->mm_node; + struct amdgpu_device *adev = amdgpu_ttm_adev(man->bdev); enum drm_mm_search_flags sflags = DRM_MM_SEARCH_BEST; enum drm_mm_allocator_flags aflags = DRM_MM_CREATE_DEFAULT; unsigned long fpfn, lpfn; @@ -124,8 +125,10 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man, r = drm_mm_insert_node_in_range_generic(&mgr->mm, node, mem->num_pages, mem->page_alignment, 0, fpfn, lpfn, sflags, aflags); - if (!r) + if (!r) { mgr->available -= mem->num_pages; + atomic64_add(mem->size, &adev->gtt_usage); + } spin_unlock(&mgr->lock); if (!r) { @@ -140,9 +143,11 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man, void amdgpu_gtt_mgr_print(struct seq_file *m, struct ttm_mem_type_manager *man) { + struct amdgpu_device *adev = amdgpu_ttm_adev(man->bdev); struct amdgpu_gtt_mgr *mgr = man->priv; - seq_printf(m, "man size:%llu pages, gtt available:%llu pages\n", - man->size, mgr->available); + seq_printf(m, "man size:%llu pages, gtt available:%llu pages, usage:%lluMB\n", + man->size, mgr->available, + (u64)atomic64_read(&adev->gtt_usage) >> 20); } /** * amdgpu_gtt_mgr_new - allocate a new node @@ -213,6 +218,7 @@ static void amdgpu_gtt_mgr_del(struct ttm_mem_type_manager *man, { struct amdgpu_gtt_mgr *mgr = man->priv; struct drm_mm_node *node = mem->mm_node; + struct amdgpu_device *adev = amdgpu_ttm_adev(man->bdev); if (!node) return; @@ -221,6 +227,7 @@ static void amdgpu_gtt_mgr_del(struct ttm_mem_type_manager *man, if (node->start != AMDGPU_BO_INVALID_OFFSET) { drm_mm_remove_node(node); mgr->available += mem->num_pages; + atomic64_sub(mem->size, &adev->gtt_usage); } spin_unlock(&mgr->lock); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 3cde1c9..2249eb6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -61,9 +61,6 @@ static void amdgpu_update_memory_usage(struct amdgpu_device *adev, if (new_mem) { switch (new_mem->mem_type) { - case TTM_PL_TT: - atomic64_add(new_mem->size, &adev->gtt_usage); - break; case TTM_PL_VRAM: atomic64_add(new_mem->size, &adev->vram_usage); vis_size = amdgpu_get_vis_part_size(adev, new_mem); @@ -80,9 +77,6 @@ static void amdgpu_update_memory_usage(struct amdgpu_device *adev, if (old_mem) { switch (old_mem->mem_type) { - case TTM_PL_TT: - atomic64_sub(old_mem->size, &adev->gtt_usage); - break; case TTM_PL_VRAM: atomic64_sub(old_mem->size, &adev->vram_usage); vis_size = amdgpu_get_vis_part_size(adev, old_mem); -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 1/3] drm/amdgpu: fix gtt mgr available statistics
gtt_mgr_alloc is called by many places in local driver, while gtt_mgr_new is called by get_node in ttm. Change-Id: Ia5a18a3b531a01ad7d47f40e08f778e7b94c048a Signed-off-by: Chunming Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c index 69ab2ee..8a950a5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c @@ -124,6 +124,8 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man, r = drm_mm_insert_node_in_range_generic(&mgr->mm, node, mem->num_pages, mem->page_alignment, 0, fpfn, lpfn, sflags, aflags); + if (!r) + mgr->available -= mem->num_pages; spin_unlock(&mgr->lock); if (!r) { @@ -160,7 +162,6 @@ static int amdgpu_gtt_mgr_new(struct ttm_mem_type_manager *man, spin_unlock(&mgr->lock); return 0; } - mgr->available -= mem->num_pages; spin_unlock(&mgr->lock); node = kzalloc(sizeof(*node), GFP_KERNEL); @@ -187,9 +188,6 @@ static int amdgpu_gtt_mgr_new(struct ttm_mem_type_manager *man, return 0; err_out: - spin_lock(&mgr->lock); - mgr->available += mem->num_pages; - spin_unlock(&mgr->lock); return r; } @@ -214,9 +212,10 @@ static void amdgpu_gtt_mgr_del(struct ttm_mem_type_manager *man, return; spin_lock(&mgr->lock); - if (node->start != AMDGPU_BO_INVALID_OFFSET) + if (node->start != AMDGPU_BO_INVALID_OFFSET) { drm_mm_remove_node(node); - mgr->available += mem->num_pages; + mgr->available += mem->num_pages; + } spin_unlock(&mgr->lock); kfree(node); -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
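The commit message's point, sketched in isolation: with two entry points, the usage counter must only be adjusted where address space is actually reserved or released, and under the same lock as the reservation itself. A minimal, self-contained sketch of that rule; all names are simplified, illustrative stand-ins for the amdgpu_gtt_mgr fields, not the driver code itself:

#include <pthread.h>
#include <stdio.h>

struct node { long start; long pages; };
#define INVALID_OFFSET (-1L)

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long available = 1024;		/* pages */

/* TTM-style entry point: only creates a deferred placeholder node,
 * so it must not touch the counter */
static void mgr_new(struct node *n, long pages)
{
	n->start = INVALID_OFFSET;	/* not bound yet */
	n->pages = pages;
}

/* driver entry point: performs the real reservation */
static int mgr_alloc(struct node *n)
{
	int r = 0;

	pthread_mutex_lock(&lock);
	if (available >= n->pages) {
		n->start = 1024 - available;	/* stand-in for drm_mm insert */
		available -= n->pages;		/* account exactly once, here */
	} else {
		r = -1;
	}
	pthread_mutex_unlock(&lock);
	return r;
}

static void mgr_del(struct node *n)
{
	pthread_mutex_lock(&lock);
	if (n->start != INVALID_OFFSET)		/* only give back what was taken */
		available += n->pages;
	pthread_mutex_unlock(&lock);
}

int main(void)
{
	struct node a;

	mgr_new(&a, 256);			/* deferred: counter untouched */
	if (mgr_alloc(&a) == 0)
		printf("bound at %ld, %ld pages left\n", a.start, available);
	mgr_del(&a);
	printf("after del: %ld pages left\n", available);
	return 0;
}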
[PATCH] drm/amdgpu: fix dead lock if any ip block resume failed in s3
Driver must free the console lock whether driver resuming successful or not. Otherwise, fb_console will be always waiting for the lock and then cause system stuck. [ 244.405541] INFO: task kworker/0:0:4 blocked for more than 120 seconds. [ 244.405543] Tainted: G OE 4.9.0-custom #1 [ 244.405544] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 244.405541] INFO: task kworker/0:0:4 blocked for more than 120 seconds. [ 244.405543] Tainted: G OE 4.9.0-custom #1 [ 244.405544] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 244.405550] kworker/0:0 D0 4 2 0x0008 [ 244.405559] Workqueue: events console_callback [ 244.405564] 88045a2cfc00 880462b75940 81c0e500 [ 244.405568] 880476419280 c900018f7c90 817dcf62 003c [ 244.405572] 0001 0002 880462b75940 880462b75940 [ 244.405573] Call Trace: [ 244.405580] [] ? __schedule+0x222/0x6a0 [ 244.405584] [] schedule+0x36/0x80 [ 244.405588] [] schedule_timeout+0x1fc/0x390 [ 244.405592] [] __down_common+0xa5/0xf8 [ 244.405598] [] ? put_prev_entity+0x48/0x710 [ 244.405601] [] __down+0x1d/0x1f [ 244.405606] [] down+0x41/0x50 [ 244.405611] [] console_lock+0x1a/0x40 [ 244.405614] [] console_callback+0x13/0x160 [ 244.405617] [] ? __schedule+0x22a/0x6a0 [ 244.405623] [] process_one_work+0x153/0x3f0 [ 244.405628] [] worker_thread+0x12b/0x4b0 [ 244.405633] [] ? rescuer_thread+0x350/0x350 [ 244.405637] [] kthread+0xd3/0xf0 [ 244.405641] [] ? kthread_park+0x60/0x60 [ 244.405645] [] ? kthread_park+0x60/0x60 [ 244.405649] [] ret_from_fork+0x25/0x30 Signed-off-by: Huang Rui --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 28 1 file changed, 12 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index bd3a0d5..abb4dcc 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2280,7 +2280,7 @@ int amdgpu_device_resume(struct drm_device *dev, bool resume, bool fbcon) struct drm_connector *connector; struct amdgpu_device *adev = dev->dev_private; struct drm_crtc *crtc; - int r; + int r = 0; if (dev->switch_power_state == DRM_SWITCH_POWER_OFF) return 0; @@ -2292,11 +2292,8 @@ int amdgpu_device_resume(struct drm_device *dev, bool resume, bool fbcon) pci_set_power_state(dev->pdev, PCI_D0); pci_restore_state(dev->pdev); r = pci_enable_device(dev->pdev); - if (r) { - if (fbcon) - console_unlock(); - return r; - } + if (r) + goto unlock; } if (adev->is_atom_fw) amdgpu_atomfirmware_scratch_regs_restore(adev); @@ -2313,7 +2310,7 @@ int amdgpu_device_resume(struct drm_device *dev, bool resume, bool fbcon) r = amdgpu_resume(adev); if (r) { DRM_ERROR("amdgpu_resume failed (%d).\n", r); - return r; + goto unlock; } amdgpu_fence_driver_resume(adev); @@ -2324,11 +2321,8 @@ int amdgpu_device_resume(struct drm_device *dev, bool resume, bool fbcon) } r = amdgpu_late_init(adev); - if (r) { - if (fbcon) - console_unlock(); - return r; - } + if (r) + goto unlock; /* pin cursors */ list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) { @@ -2349,7 +2343,7 @@ int amdgpu_device_resume(struct drm_device *dev, bool resume, bool fbcon) } r = amdgpu_amdkfd_resume(adev); if (r) - return r; + goto unlock; /* blat the mode back in */ if (fbcon) { @@ -2396,12 +2390,14 @@ int amdgpu_device_resume(struct drm_device *dev, bool resume, bool fbcon) dev->dev->power.disable_depth--; #endif - if (fbcon) { + if (fbcon) amdgpu_fbdev_set_suspend(adev, 0); + +unlock: + if (fbcon) console_unlock(); - } - return 0; + return r; } 
static bool amdgpu_check_soft_reset(struct amdgpu_device *adev) -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
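The shape of the fix above, reduced to its essentials: a single unlock exit that the success path and every error path fall through. A minimal sketch, where resource_lock()/resource_unlock() and the step functions are hypothetical stand-ins for console_lock()/console_unlock() and the individual resume steps:

#include <stdio.h>

static void resource_lock(void)   { puts("lock taken"); }
static void resource_unlock(void) { puts("lock released"); }
static int step_a(void) { return 0; }
static int step_b(void) { return -5; /* pretend this step fails */ }

static int resume(int fbcon)
{
	int r = 0;

	if (fbcon)
		resource_lock();

	r = step_a();
	if (r)
		goto unlock;	/* every error path funnels through unlock */

	r = step_b();
	if (r)
		goto unlock;

unlock:
	if (fbcon)
		resource_unlock();	/* released on success and failure alike */
	return r;
}

int main(void)
{
	printf("resume returned %d\n", resume(1));
	return 0;
}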
Re: [RfC PATCH] drm: fourcc byteorder: brings header file comments in line with reality.
On Tue, 11 Apr 2017 13:23:53 +0200 Gerd Hoffmann wrote: > Hi, > > > > Just let me know what you need tested, I should be able to turn it around > > > within a couple of days. > > > > That's part of my problem. I don't really know what should be tested. > > What do people do with their BE machines that we should avoid breaking? > > For the virtual machine use case the bar is pretty low, it's mostly > about a graphical server console. Anaconda installer. Gnome desktop > with browser and standard xorg (xterm) + gtk apps. No heavy OpenGL > stuff. No hardware acceleration, so if opengl is used then it'll be > llvmpipe. > > Right now Xorg is important. Not sure whether wayland ever will be, > possibly the ppc64 -> ppc64le switch goes faster than the xorg -> > wayland switch. Hi, IMHO you can ignore Wayland for now I suppose, I just wanted to point out that we have similar problems there and whatever you do with the DRM format codes will affect things on Wayland too. Once you get things hashed out on an X.org based stack, we can look at what it means for Wayland software. After all, BE users are scarce and allegedly favouring old software to avoid breakage; Wayland is new, and Wayland compositors still "rare", so the intersection of people using both BE and Wayland and relying on it to work is... minuscule? insignificant? I don't mean to belittle people that use Wayland on BE, but by that one bug report EGL is and probably has been broken, and it's unclear if anything has ever worked. Thanks, pq ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx