[ https://issues.apache.org/jira/browse/LUCENE-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134204#comment-14134204 ]
Dawid Weiss commented on LUCENE-5949:
-------------------------------------

Very cool. I just needed it very recently and had to inspect stuff manually.

> Add Accountable.getChildResources()
> -----------------------------------
>
>                 Key: LUCENE-5949
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5949
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Robert Muir
>         Attachments: LUCENE-5949.patch
>
>
> Since Lucene 4.5, you can see how much memory Lucene is using at a basic
> level by looking at SegmentReader.ramBytesUsed().
> In 4.11 it's already improved: you can pull the codec producers and get RAM
> usage split out by postings, norms, docvalues, stored fields, term vectors,
> etc.
> Unfortunately most toString's are fairly useless, so you don't have any
> insight further than that, even though behind the scenes it's mostly just
> adding up other Accountables.
> So instead, if we can improve the toString's, and if an Accountable can return
> its children, we can connect all the dots and you can easily diagnose/debug
> issues and see what is going on. I know I've been frustrated with having to
> hack up tons of System.out.printlns during development to see this stuff.
> So I think we should add this method to Accountable:
> {code}
>   /**
>    * Returns nested resources of this class.
>    * The result should be a point-in-time snapshot (to avoid race conditions).
>    * @see Accountables
>    */
>   // TODO: on java8 make this a default method returning emptyList
>   Iterable<? extends Accountable> getChildResources();
> {code}
> We can also add a simple helper method for quick debugging,
> {{Accountables.toString(Accountable)}}, to print the "tree". Example output
> for a Lucene segment:
> {noformat}
> _5f(5.0.0):C8330469: 36.4 MB
> |-- postings [PerFieldPostings(formats=1)]: 8 MB
>     |-- format 'Lucene41_0' [BlockTreeTermsReader(fields=6,delegate=Lucene41PostingsReader(positions=true,payloads=false))]: 8 MB
>         |-- field 'alternatenames' [BlockTreeTerms(terms=3360242,postings=13779349,positions=17102250,docs=2876726)]: 945.2 KB
>             |-- term index [FST(input=BYTE1,output=ByteSequenceOutputs,packed=false,nodes=23318,arcs=66497)]: 945.1 KB
>         |-- field 'asciiname' [BlockTreeTerms(terms=2451266,postings=16849659,positions=16891234,docs=8329981)]: 686.1 KB
>             |-- term index [FST(input=BYTE1,output=ByteSequenceOutputs,packed=false,nodes=12976,arcs=44103)]: 686 KB
>         |-- field 'geonameid' [BlockTreeTerms(terms=8363399,postings=33321876,positions=-1,docs=8330469)]: 1.3 MB
>             |-- term index [FST(input=BYTE1,output=ByteSequenceOutputs,packed=false,nodes=528,arcs=66225)]: 1.3 MB
>         |-- field 'latitude' [BlockTreeTerms(terms=8714542,postings=33321876,positions=-1,docs=8330469)]: 1.7 MB
>             |-- term index [FST(input=BYTE1,output=ByteSequenceOutputs,packed=false,nodes=854,arcs=77300)]: 1.7 MB
>         |-- field 'longitude' [BlockTreeTerms(terms=11557222,postings=33321876,positions=-1,docs=8330469)]: 2.6 MB
>             |-- term index [FST(input=BYTE1,output=ByteSequenceOutputs,packed=false,nodes=1577,arcs=114570)]: 2.6 MB
>         |-- field 'name' [BlockTreeTerms(terms=2598879,postings=16833071,positions=16874267,docs=8330325)]: 771.5 KB
>             |-- term index [FST(input=BYTE1,output=ByteSequenceOutputs,packed=false,nodes=13790,arcs=46514)]: 771.3 KB
>         |-- delegate [Lucene41PostingsReader(positions=true,payloads=false)]: 32 bytes
> |-- norms [Lucene49NormsProducer(fields=3,active=3)]: 15.9 MB
>     |-- field 'alternatenames' [byte array]: 7.9 MB
>     |-- field 'asciiname' [table compressed [Packed64SingleBlock4(bitsPerValue=4,size=8330469,blocks=520655)]]: 4 MB
>     |-- field 'name' [table compressed [Packed64SingleBlock4(bitsPerValue=4,size=8330469,blocks=520655)]]: 4 MB
> |-- docvalues [PerFieldDocValues(formats=1)]: 12.1 MB
>     |-- format 'Lucene410_0' [Lucene410DocValuesProducer(fields=5)]: 12.1 MB
>         |-- addresses field 'alternatenames' [MonotonicBlockPackedReader(blocksize=16384,size=407026,avgBPV=16)]: 808.5 KB
>         |-- addresses field 'asciiname' [MonotonicBlockPackedReader(blocksize=16384,size=330528,avgBPV=17)]: 698.6 KB
>         |-- addresses field 'name' [MonotonicBlockPackedReader(blocksize=16384,size=335020,avgBPV=17)]: 703.7 KB
>         |-- ord index field 'alternatenames' [MonotonicBlockPackedReader(blocksize=16384,size=8330470,avgBPV=9)]: 9.8 MB
>         |-- reverse index field 'alternatenames' [ReverseTermsIndex(size=6360)]: 77.9 KB
>             |-- term bytes [PagedBytes(blocksize=32768)]: 67.7 KB
>             |-- term addresses [MonotonicBlockPackedReader(blocksize=16384,size=6360,avgBPV=13)]: 10.2 KB
>         |-- reverse index field 'asciiname' [ReverseTermsIndex(size=5165)]: 60.1 KB
>             |-- term bytes [PagedBytes(blocksize=32768)]: 53 KB
>             |-- term addresses [MonotonicBlockPackedReader(blocksize=16384,size=5165,avgBPV=11)]: 7 KB
>         |-- reverse index field 'name' [ReverseTermsIndex(size=5235)]: 61.2 KB
>             |-- term bytes [PagedBytes(blocksize=32768)]: 54.1 KB
>             |-- term addresses [MonotonicBlockPackedReader(blocksize=16384,size=5235,avgBPV=11)]: 7.1 KB
> |-- stored fields [CompressingStoredFieldsReader(mode=FAST,chunksize=16384)]: 216.3 KB
>     |-- stored field index [CompressingStoredFieldsIndexReader(blocks=65)]: 216.3 KB
>         |-- doc base deltas: 55.8 KB
>         |-- start pointer deltas: 158.9 KB
> |-- term vectors [CompressingTermVectorsReader(mode=FAST,chunksize=4096)]: 224 KB
>     |-- term vector index [CompressingStoredFieldsIndexReader(blocks=67)]: 224 KB
>         |-- doc base deltas: 65.6 KB
>         |-- start pointer deltas: 156.8 KB
> {noformat}
> Note this works for any Accountable, so also e.g. NRTCachingDirectory,
> OrdinalMap, Suggesters, FSTs, and so on. You can also traverse the graph
> yourself and output whatever you want.
> To be safe, I define that the graph returned is a "point in time snapshot"
> and free of race conditions. The Accountable helper methods provide this and
> also prevent access (even via cast) to data structures you shouldn't be able
> to get to; they just provide information.
> Since we aren't on Java 8 yet (and cannot provide a simple default method),
> I think we should instead just add the method to Accountable, but add default
> emptyList() implementations to impacted data structures such as DocIdSet and
> Suggester. For codec APIs, which are lower level, I think it's best to leave
> the method abstract, since they should really be providing useful
> information.
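
For readers who want to see what "traverse the graph yourself" could look like, here is a minimal self-contained sketch. It does not use Lucene's classes: the nested `Accountable` interface is a local stand-in that mirrors the method proposed in the issue, and `Node`, `leaf` and `render` are hypothetical names made up for illustration. The output format only loosely imitates the tree above and prints raw byte counts rather than human-readable sizes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class AccountableTreeSketch {

  /** Local stand-in mirroring the proposed interface; NOT Lucene's own class. */
  interface Accountable {
    long ramBytesUsed();

    Iterable<? extends Accountable> getChildResources();
  }

  /** Hypothetical immutable node: a label, a size, and a snapshot of its children. */
  static final class Node implements Accountable {
    private final String name;
    private final long bytes;
    private final List<Accountable> children;

    Node(String name, long bytes, List<? extends Accountable> children) {
      this.name = name;
      this.bytes = bytes;
      // Copy the list so later changes by the caller cannot alter this snapshot.
      this.children = Collections.unmodifiableList(new ArrayList<Accountable>(children));
    }

    @Override
    public long ramBytesUsed() {
      return bytes;
    }

    @Override
    public Iterable<? extends Accountable> getChildResources() {
      return children;
    }

    @Override
    public String toString() {
      return name;
    }
  }

  /** A resource without children, i.e. the "default emptyList()" case. */
  static Node leaf(String name, long bytes) {
    return new Node(name, bytes, Collections.<Accountable>emptyList());
  }

  /** Renders the tree depth-first, one line per resource. */
  static String render(Accountable root) {
    StringBuilder sb = new StringBuilder();
    render(sb, root, 0);
    return sb.toString();
  }

  private static void render(StringBuilder sb, Accountable a, int depth) {
    // Children of the root get "|-- "; deeper levels add four spaces each.
    for (int i = 1; i < depth; i++) {
      sb.append("    ");
    }
    if (depth > 0) {
      sb.append("|-- ");
    }
    sb.append(a).append(": ").append(a.ramBytesUsed()).append(" bytes\n");
    for (Accountable child : a.getChildResources()) {
      render(sb, child, depth + 1);
    }
  }

  public static void main(String[] args) {
    // Made-up sizes, purely to exercise the traversal.
    Node termIndex = leaf("term index", 1000);
    Node field = new Node("field 'name'", 1200, Arrays.asList(termIndex));
    Node postings = new Node("postings", 1200, Arrays.asList(field));
    Node norms = leaf("norms", 500);
    Node segment = new Node("segment", 1700, Arrays.asList(postings, norms));
    System.out.print(render(segment));
  }
}
```

Copying the children into an unmodifiable list in the constructor is one simple way to honor the "point in time snapshot" wording; a real implementation may need to do more if the underlying structures are mutable.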