Hi Lukasz - good news that you can get through the whole dump now :-) I'd like to get a list of the changes you made to the reader, if that's possible?
On Thu, Jan 14, 2010 at 5:21 PM, Lukasz <[email protected]> wrote:

> Hi Steve, Stuart,
>
> Over the last two days I have been trying to move further.
>
> I tried to prepare a workaround for the incorrect length. The first idea
> was to exploit the fact that an integer overflow had probably taken
> place, so I searched for 0x2c just at the overflow points (probably
> quite a brittle approach). It turned out, however, that in my heap dumps
> (both the real one and the one prepared for development purposes) there
> is no 0x2c byte at or after the end of the dump.
> I did notice that in both dumps the heap record ends at the end of the
> file. That is probably not a general rule, but for my purposes it should
> be enough, so I changed the Kato sources slightly to make use of it and
> moved on.
>
> So far I have some code which extracts static data (strings, classes)
> from the dump and searches for instances of a given class. I have run it
> against the real dump. It took around 40 min to get the static data and
> 50 min to find the instances, which is not bad, I think. I haven't
> reasoned carefully about performance, but the CPU (not I/O) looked to be
> the bottleneck, and performance looked quite stable across the whole
> dump.
>
> The 60GB dump was extracted from a core file using jmap (I don't
> remember exactly which version of Java that jmap came from, but it was
> certainly Java 1.6). The following JVM version was dumped to the core
> file:
>
> java version "1.6.0_10"
> Java(TM) SE Runtime Environment (build 1.6.0_10-b33)
> Java HotSpot(TM) 64-Bit Server VM (build 11.0-b15, mixed mode)
>
> I also suspected that heap dump segments were introduced to handle heap
> sizes above 4GB, but it looks like that is not always the case.
>
> Anyway, it is nice that I was able to get through the whole dump without
> any error.
>
> Regards
> Lukasz
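A minimal sketch of the workaround Lukasz describes, for readers following the thread. The class and all names below are illustrative rather than the actual Kato change, and it assumes, as Lukasz observed for his dumps, that the overflowed HEAP DUMP record is the last record in the file; it uses the fact that a wrapped 32-bit length field should still match the true length modulo 2^32:

import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class HeapDumpEofWorkaround {
    private static final int HEAP_DUMP = 0x0c;          // monolithic heap dump
    private static final int HEAP_DUMP_SEGMENT = 0x1c;  // 1.0.2 segmented form
    private static final int HEAP_DUMP_END = 0x2c;      // closes the segments

    public static void main(String[] args) throws IOException {
        File file = new File(args[0]);
        long fileLength = file.length();
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            long offset = 0;
            while (in.read() != 0) offset++;   // NUL-terminated version string
            offset++;                          // the NUL itself
            in.readInt();                      // u4: identifier size
            in.readLong();                     // u8: dump timestamp
            offset += 12;

            int tag;
            while ((tag = in.read()) != -1) {
                in.readInt();                                // u4: microseconds
                long declared = in.readInt() & 0xFFFFFFFFL;  // u4: length, unsigned
                long bodyStart = offset + 9;                 // tag + time + length
                long length = declared;
                if (tag == HEAP_DUMP) {
                    long toEof = fileLength - bodyStart;
                    // A wrapped 32-bit counter still matches the true length
                    // modulo 2^32; if so, trust EOF rather than the field.
                    if (toEof != declared && (toEof & 0xFFFFFFFFL) == declared) {
                        length = toEof;
                    }
                }
                System.out.printf("record 0x%02x at offset %d, length %d%n",
                        tag, offset, length);
                for (long rest = length; rest > 0; ) {   // body may exceed 2GB
                    long skipped = in.skip(rest);
                    if (skipped <= 0) throw new IOException("truncated record");
                    rest -= skipped;
                }
                offset = bodyStart + length;
            }
        }
    }
}

For a segmented 1.0.2 dump the same loop would instead accumulate HEAP DUMP SEGMENT (0x1c) records until it meets HEAP DUMP END (0x2c), along the lines Stuart suggests below.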
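For reference, extracting the hprof file from a core with the Java 6 jmap Lukasz mentions looks roughly like this (the file names are placeholders):

jmap -dump:format=b,file=heap.hprof $JAVA_HOME/bin/java core.1234

The executable argument should be the java binary that produced the core file.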
> Steve Poole wrote:
>
>> Hi Lukasz,
>>
>> You say your hprof format is 1.0.1, so, according to the information I
>> have about the hprof format, your file won't have any heap dump
>> segments. Apparently they are in the 1.0.2 format, which is a Java 6
>> addition and is only used when the heap is too big for the old format
>> to cope with.
>>
>> What JVM and OS are you using to produce this dump?
>>
>> On Tue, Jan 12, 2010 at 6:17 PM, Stuart Monteith <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> You are quite right, 4 bytes for the record length. If we had the file
>>> format wrong for 64-bit hprof dumps we'd not be able to read any dump,
>>> so it must be an overflow of the record length. In that case, the only
>>> possible solution I can think of would be for us to change the code to
>>> search for the HEAP DUMP END record (tagged 0x2c) to detect the end of
>>> the HEAP DUMP.
>>>
>>> Of course, there was some puzzling over the difference between HEAP
>>> DUMP SEGMENT (0x1c) and HEAP DUMP (0x0c) before.
>>>
>>> I'd expect we'd get:
>>>
>>> HEAP DUMP
>>>   heap dump records...
>>> HEAP DUMP END
>>>
>>> or
>>>
>>> HEAP DUMP SEGMENT
>>>   heap dump records
>>> HEAP DUMP SEGMENT
>>>   heap dump records
>>>   :     :
>>> HEAP DUMP END
>>>
>>> If we ignore the record size for the HEAP DUMP and calculate it
>>> ourselves, we might have more luck.
>>>
>>> Regards,
>>> Stuart
>>>
>>> Lukasz wrote:
>>>
>>>> Hello,
>>>>
>>>> Since performance doesn't look bad after changing the array size,
>>>> today I tried to process my 60GB heap dump. Unfortunately, I very
>>>> quickly ended up with an exception:
>>>>
>>>> Exception in thread "main" java.lang.IllegalArgumentException: range -1293366370 is less than 0
>>>>     at org.apache.kato.common.SubsetDataProvider.<init>(SubsetDataProvider.java:29)
>>>>     at org.apache.kato.hprof.datalayer.HProfFile$HPROFRecordProvider.getCurrentElement(HProfFile.java:487)
>>>>     at org.apache.kato.hprof.datalayer.HProfFile$HPROFRecordProvider.getCurrentElement(HProfFile.java:437)
>>>>     at org.apache.kato.common.BitMaskMappingArray.get(BitMaskMappingArray.java:56)
>>>>     at org.apache.kato.hprof.datalayer.HProfFile.getRecord(HProfFile.java:784)
>>>>     ... (my code) ...
>>>>
>>>> It looks like the range of the HeapDumpHProfRecord had the value
>>>> -1293366370.
>>>>
>>>> The following is the part of the dump which I believe is responsible
>>>> for that (B2 E8 CB 9E):
>>>>
>>>> 0055DD84 25 00 00 2A │ B7 AE 22 06 │ 80 00 00 00 │ 01 00 00 2A │ B7 AE 20 0B │ 88 02 00 00 │ 00 00 00 00 │ 00 18 00 00 │ 23 26 00 00
>>>> 0055DDA8 2A B7 B0 A1 │ 64 28 00 00 │ 00 01 00 00 │ 2A B7 B0 33 │ EA F8 0C 00 │ 00 00 00 B2 │ E8 CB 9E 20 │ 00 00 2A B7 │ B0 DE AF C8
>>>> 0055DDCC 00 00 00 01 │ 00 00 2A B7 │ AE 22 0F 20 │ 00 00 2A AA │ BE 21 3D 28 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 2A AA │ BE 25 B7 D8
>>>>
>>>> Interestingly, the hprof file format reserves only 4 bytes for the
>>>> length of a record, which can express a record size of at most 4GB
>>>> (assuming the 4 bytes are unsigned).
>>>>
>>>> My dump's version is: JAVA PROFILE 1.0.1
>>>>
>>>> Regards
>>>> Lukasz
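The bytes Lukasz highlights decode to exactly the value in the exception. As a quick standalone check (not Kato code):

class LengthOverflowCheck {
    public static void main(String[] args) {
        // B2 E8 CB 9E from the hex dump above, read as the record's
        // big-endian u4 length field into Java's signed 32-bit int:
        int declared = (0xB2 << 24) | (0xE8 << 16) | (0xCB << 8) | 0x9E;
        System.out.println(declared);               // -1293366370, as in the exception
        System.out.println(declared & 0xFFFFFFFFL); // 3001600926 when read as unsigned
        // Even read unsigned, a u4 field caps a record at ~4GB; on a 60GB
        // heap the true length must be 3001600926 + k * 2^32 for some k >= 1.
    }
}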
>>>> Lukasz wrote:
>>>>
>>>>> Hi Stuart, Steve,
>>>>>
>>>>> I've taken a deeper look into the code. I still haven't traced the
>>>>> index calculation in the BitMaskMappingArray and
>>>>> ArrayBitMaskMappingStrategy classes carefully, but I managed to
>>>>> improve performance by increasing the array size in those classes
>>>>> (it is set in the HProfFile class).
>>>>>
>>>>> If I understand the code correctly, when the capacity of the
>>>>> BitMaskMappingArray is exhausted the bucketSize is doubled, which in
>>>>> turn means more reads (even cached ones) are required to set the
>>>>> position of the IDataProvider/IArrayEntryProvider.
>>>>>
>>>>> Below are the loading times for the default array size (1000) and an
>>>>> increased one (1000000), run against a generated dump file (5000000
>>>>> instances of Data).
>>>>>
>>>>> Default (1000):
>>>>> HeapSubRecord: 100000 (866ms, 4215kB)
>>>>> HeapSubRecord: 200000 (1716ms, 7879kB)
>>>>> HeapSubRecord: 300000 (2833ms, 11263kB)
>>>>> HeapSubRecord: 400000 (3889ms, 14283kB)
>>>>> HeapSubRecord: 500000 (3893ms, 17319kB)
>>>>> HeapSubRecord: 600000 (7248ms, 20479kB) (here the bucketSize was probably doubled)
>>>>> HeapSubRecord: 700000 (7721ms, 23531kB)
>>>>> HeapSubRecord: 800000 (7729ms, 26567kB)
>>>>> HeapSubRecord: 900000 (7731ms, 29671kB)
>>>>> HeapSubRecord: 1000000 (7704ms, 32731kB)
>>>>> ... (I didn't wait until the end)
>>>>>
>>>>> Increased (1000000):
>>>>> HeapSubRecord: 100000 (622ms, 17809kB)
>>>>> HeapSubRecord: 200000 (309ms, 20345kB)
>>>>> HeapSubRecord: 300000 (283ms, 23861kB)
>>>>> HeapSubRecord: 400000 (274ms, 27921kB)
>>>>> HeapSubRecord: 500000 (269ms, 29957kB)
>>>>> HeapSubRecord: 600000 (264ms, 31993kB)
>>>>> HeapSubRecord: 700000 (272ms, 36097kB)
>>>>> HeapSubRecord: 800000 (288ms, 37739kB)
>>>>> HeapSubRecord: 900000 (263ms, 39835kB)
>>>>> HeapSubRecord: 1000000 (259ms, 41931kB)
>>>>> HeapSubRecord: 1100000 (300ms, 44773kB)
>>>>> HeapSubRecord: 1200000 (283ms, 46901kB)
>>>>> HeapSubRecord: 1300000 (291ms, 49029kB)
>>>>> HeapSubRecord: 1400000 (328ms, 53801kB)
>>>>> HeapSubRecord: 1500000 (259ms, 53801kB)
>>>>> HeapSubRecord: 1600000 (272ms, 58125kB)
>>>>> HeapSubRecord: 1700000 (264ms, 60293kB)
>>>>> HeapSubRecord: 1800000 (264ms, 62473kB)
>>>>> HeapSubRecord: 1900000 (361ms, 61373kB)
>>>>> HeapSubRecord: 2000000 (274ms, 63105kB)
>>>>> ...
>>>>> HeapSubRecord: 9000000 (284ms, 231969kB)
>>>>> HeapSubRecord: 9100000 (272ms, 233597kB)
>>>>> HeapSubRecord: 9200000 (281ms, 236357kB)
>>>>> HeapSubRecord: 9300000 (274ms, 240469kB)
>>>>> HeapSubRecord: 9400000 (279ms, 244541kB)
>>>>> HeapSubRecord: 9500000 (269ms, 246549kB)
>>>>> HeapSubRecord: 9600000 (279ms, 250565kB)
>>>>> HeapSubRecord: 9700000 (265ms, 252573kB)
>>>>> HeapSubRecord: 9800000 (279ms, 256629kB)
>>>>> HeapSubRecord: 9900000 (265ms, 258669kB)
>>>>> HeapSubRecord: 10000000 (463ms, 263997kB)
>>>>> (end)
>>>>>
>>>>> That is, my 60GB dump file contains more than 1,100,000,000 objects
>>>>> (if I remember correctly).
>>>>>
>>>>> Regards
>>>>> Lukasz
>>>>>
>>>>> Stuart Monteith wrote:
>>>>>
>>>>>> The hprof dump reader spends a lot of time reading the whole file,
>>>>>> for various reasons. The indices it holds in memory are constructed
>>>>>> during an initial read, which is also the source of the memory
>>>>>> usage. In addition, there is some correlation to be done, which
>>>>>> takes up further time and induces yet more reading.
>>>>>>
>>>>>> I'm sure some work could be done to improve the performance further,
>>>>>> but we'll have to look at the tradeoff between disk space and memory
>>>>>> usage. The hprof file format itself is what it is, however, and we
>>>>>> have no influence over that. The CJVMTI agent has lots of room for
>>>>>> improvement, but I suspect it is unlikely to end up much better than
>>>>>> existing hprof implementations. The built-in JVM hprof dumper will
>>>>>> probably be a hard act to follow.
>>>>>>
>>>>>> The HProf implementation is not thread-safe. Realistically, I think
>>>>>> that is something that ought to be considered once things are more
>>>>>> mature. There will be algorithms that can deal with the JVM
>>>>>> structure sensibly.
>>>>>>
>>>>>> And thanks Lukasz, it's great to have your input.
>>>>>>
>>>>>> Regards,
>>>>>> Stuart
>>>
>>> --
>>> Stuart Monteith
>>> http://blog.stoo.me.uk/

--
Steve
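The behaviour Lukasz describes for the default settings suggests an index of roughly the following shape (a hypothetical sketch: the names and details are illustrative, not Kato's actual BitMaskMappingArray): a fixed-size array of file offsets indexes every bucketSize-th record; when it fills, bucketSize doubles and every second entry is dropped, so each lookup must re-parse up to bucketSize - 1 records from the nearest indexed offset. That is consistent with the per-100000-record time roughly doubling from ~3.9s to ~7.2s at the point Lukasz flags in the default run, and with the flat times once the array is large enough never to compact.

class SparseRecordIndex {
    private final long[] offsets; // file offset of every bucketSize-th record
    private int bucketSize = 1;
    private int count = 0;        // records indexed so far

    SparseRecordIndex(int capacity) { offsets = new long[capacity]; }

    void recordParsed(long fileOffset) {
        if (count % bucketSize == 0) {
            int slot = count / bucketSize;
            if (slot == offsets.length) {            // index full: halve resolution
                for (int i = 0; i < offsets.length / 2; i++)
                    offsets[i] = offsets[i * 2];     // keep every second entry
                bucketSize *= 2;
                slot = count / bucketSize;
            }
            offsets[slot] = fileOffset;
        }
        count++;
    }

    // Seek to the nearest indexed offset at or before record n; the reader
    // then re-parses forward, so the cost grows with every bucketSize doubling.
    long seekHint(int n)      { return offsets[n / bucketSize]; }
    int  recordsToSkip(int n) { return n % bucketSize; }
}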
