Hi Thomas,

On Tue, Aug 13, 2019 at 05:36:16PM +0800, Feng Tang wrote:
> Hi Thomas,
> 
> On Mon, Aug 12, 2019 at 03:25:45PM +0800, Feng Tang wrote:
> > Hi Thomas,
> > 
> > On Fri, Aug 09, 2019 at 04:12:29PM +0800, Rong Chen wrote:
> > > Hi,
> > > 
> > > >> Actually we run the benchmark as a background process, do we need to
> > > >> disable the cursor and test again?
> > > > There's a worker thread that updates the display from the shadow buffer.
> > > > The blinking cursor periodically triggers the worker thread, but the
> > > > actual update is just the size of one character.
> > > > 
> > > > The point of the test without output is to see if the regression comes
> > > > from the buffer update (i.e., the memcpy from shadow buffer to VRAM), or
> > > > from the worker thread. If the regression goes away after disabling the
> > > > blinking cursor, then the worker thread is the problem. If it already
> > > > goes away if there's simply no output from the test, the screen update
> > > > is the problem. On my machine I have to disable the blinking cursor, so
> > > > I think the worker causes the performance drop.
> > > 
> > > We disabled redirecting stdout/stderr to /dev/kmsg, and the regression is
> > > gone.
> > > 
> > > commit:
> > >   f1f8555dfb9 drm/bochs: Use shadow buffer for bochs framebuffer console
> > >   90f479ae51a drm/mgag200: Replace struct mga_fbdev with generic framebuffer emulation
> > > 
> > > f1f8555dfb9a70a2  90f479ae51afa45efab97afdde  testcase/testparams/testbox
> > > ----------------  --------------------------  ---------------------------
> > >          %stddev      change       %stddev
> > >              \          |              \
> > >      43785              44481        vm-scalability/300s-8T-anon-cow-seq-hugetlb/lkp-knm01
> > >      43785              44481        GEO-MEAN vm-scalability.median
> > 
> > Till now, from Rong's tests:
> > 1. Disabling cursor blinking doesn't cure the regression.
> > 2. Disabling printing of the test results to the console works around
> >    the regression.
> > 
> > Also, if we set prefer_shadow to 0, the regression is gone as well.
> 
> We also did some further breakdown of the time consumed by the
> new code.
> 
> drm_fb_helper_dirty_work() calls, in sequence:
> 1. drm_client_buffer_vmap         (290 us)
> 2. drm_fb_helper_dirty_blit_real  (19240 us)
> 3. helper->fb->funcs->dirty()     ---> NULL for the mgag200 driver
> 4. drm_client_buffer_vunmap       (215 us)
> 
> The average run time is listed after each function name.
> 
> From this we can see that drm_fb_helper_dirty_blit_real() takes far too
> long (about 20 ms per run). I guess this is the root cause of the
> regression, as the original code doesn't use this dirty worker.
> 
> As said in the last email, setting prefer_shadow to 0 avoids the
> regression. Could that be an option?
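For reference, here is a simplified sketch of what the blit step boils down
to, as far as I understand it (illustrative only, not the exact kernel code;
the function and parameter names below are made up): a line-by-line memcpy of
the dirty rectangle from the fbdev shadow buffer into the vmap'ed VRAM buffer
object.

/*
 * Simplified sketch (illustrative only, not the real kernel code) of the
 * shadow-buffer blit: copy the dirty rectangle line by line from the
 * fbdev shadow buffer into the vmap'ed VRAM buffer object.
 */
#include <string.h>

static void shadow_blit_sketch(char *vram, const char *shadow,
			       unsigned int pitch, unsigned int cpp,
			       unsigned int x1, unsigned int y1,
			       unsigned int x2, unsigned int y2)
{
	size_t offset = (size_t)y1 * pitch + (size_t)x1 * cpp;
	size_t len = (size_t)(x2 - x1) * cpp;
	unsigned int y;

	for (y = y1; y < y2; y++) {
		/*
		 * Each memcpy writes through the vmap'ed buffer object
		 * into VRAM; for a large dirty rectangle these copies are
		 * what add up to the ~20 ms measured above.
		 */
		memcpy(vram + offset, shadow + offset, len);
		offset += pitch;
	}
}

If that is right, the cost scales with the size of the dirty rectangle, which
would fit the observation that the regression disappears once the test stops
writing to the console.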
Any comments on this?

Thanks,
Feng

> Thanks,
> Feng
> 
> > --- a/drivers/gpu/drm/mgag200/mgag200_main.c
> > +++ b/drivers/gpu/drm/mgag200/mgag200_main.c
> > @@ -167,7 +167,7 @@ int mgag200_driver_load(struct drm_device *dev, unsigned long flags)
> >  		dev->mode_config.preferred_depth = 16;
> >  	else
> >  		dev->mode_config.preferred_depth = 32;
> > -	dev->mode_config.prefer_shadow = 1;
> > +	dev->mode_config.prefer_shadow = 0;
> > 
> > And from the perf data, one obvious difference is that the good case
> > doesn't call drm_fb_helper_dirty_work(), while the bad case does.
> > 
> > Thanks,
> > Feng
> > 
> > > Best Regards,
> > > Rong Chen

> _______________________________________________
> LKP mailing list
> l...@lists.01.org
> https://lists.01.org/mailman/listinfo/lkp