I couldn't see anything in your post about the bit depth you are
using. 24bpp is very slow with these chips under XFree86, I presume
due to a lack of relevant documentation.

16bpp is much faster, and the loss of quality on an LCD panel is
practically unnoticeable.

I've never used Sun, so this is the best info I can offer.
Also try setting VideoRam to 4096 in whichever config file you can
(you probably have 6MiB if it's an NM2360). This fixed problems for me
before, though I'm not sure whether it is still relevant.
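
For what it's worth, here is a rough sketch of how those two settings
might look in an XFree86 4.x style XF86Config. I'm assuming the stock
"neomagic" driver and that your setup reads an XF86Config at all; the
Identifier names, the Monitor name and the mode are placeholders I made
up, so adapt them to whatever your existing file uses. On XFree86 3.x
the depth keyword is DefaultColorDepth rather than DefaultDepth.

```
# Sketch only: identifiers, monitor name and mode are placeholders.
Section "Device"
    Identifier "NeoMagic"          # placeholder name
    Driver     "neomagic"          # assumes the stock XFree86 4.x driver
    VideoRam   4096                # in KB; overrides the probed amount
EndSection

Section "Screen"
    Identifier   "Screen0"         # placeholder name
    Device       "NeoMagic"        # must match the Device Identifier above
    Monitor      "LCD Panel"       # placeholder; use your Monitor section's name
    DefaultDepth 16                # 16bpp instead of 24bpp
    SubSection "Display"
        Depth 16
        Modes "1024x768"           # placeholder; use your panel's native mode
    EndSubSection
EndSection
```

If you would rather not edit the file first, starting the server with
"startx -- -depth 16" (4.x) or "startx -- -bpp 16" (3.x) should be
enough to test whether 16bpp makes a difference.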

