After adding a stop word list that removes the top 100 words, the full-text index works perfectly.
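
For anyone who finds this thread with the same trace: below is a minimal Java sketch of how a growing buffer can end up requesting a negative array size. It is not the actual BaseX TokenBuilder code. The class and method names are made up for illustration, and the doubling rule is only an assumption about how such a buffer might grow. The one thing taken from the trace is that Arrays.copyOf throws NegativeArraySizeException when the requested length is negative.

```java
import java.util.Arrays;

public class GrowthOverflowSketch {
    // Hypothetical growth rule (an assumption, not BaseX code):
    // double the capacity using plain int arithmetic.
    public static int doubledCapacity(int capacity) {
        return capacity << 1;
    }

    // Returns true if growing a buffer of the given capacity fails
    // with NegativeArraySizeException.
    public static boolean growFails(int capacity) {
        byte[] buffer = new byte[0];
        try {
            // Arrays.copyOf rejects a negative length before allocating anything.
            Arrays.copyOf(buffer, doubledCapacity(capacity));
            return false;
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(doubledCapacity(1 << 30)); // -2147483648
        System.out.println(growFails(1 << 30));       // true
        System.out.println(growFails(8));             // false
    }
}
```

Anything that shrinks the largest lists (stop words, or splitting the documents across several databases as Christian suggested) keeps the sizes well below the int limit.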
Thanks for the suggestions!

Lars

2015-06-28 0:03 GMT+02:00 Christian Grün <christian.gr...@gmail.com>:
> Hi Lars,
>
> It looks as if the input data is indeed too large to be indexed (the
> internal id lists seem to exceed the maximum array size in main
> memory). The usual alternative to make it work is to distribute your
> document(s) into multiple databases.
>
> If you want, you can also provide us with the input data, but I assume
> it will take up quite a lot of space?
>
> Best,
> Christian
>
>
> On Sat, Jun 27, 2015 at 12:50 PM, Lars Johnsen <yoon...@gmail.com> wrote:
> > When trying to do a full text index on a collection of texts, the process
> > runs for a couple of hours and exits with the message below. I think it is
> > nearly complete at that point: in the GUI I have seen the progress bar get
> > to around 80 %, so I think it is safe to assume that the error is connected
> > to the final stages.
> >
> > The texts are unstructured and represented as one line per book. Here is
> > the result from the index process. Parameters set in the GUI are: Norwegian
> > Snowball, lemmatization, diacritics. 30 GB is set aside for the GUI.
> >
> > Path summary:
> > doc(): 317259x, strings
> > text: 317259x, leaf
> > text(): 317259x, strings, leaf
> >
> > Here is the error message:
> >
> > Improper use? Potential bug? Your feedback is welcome:
> > Contact: basex-talk@mailman.uni-konstanz.de
> > Version: BaseX 8.2 beta 7d38949
> > Java: Oracle Corporation, 1.7.0_79
> > OS: Linux, amd64
> > Stack Trace:
> > java.lang.NegativeArraySizeException
> >   at java.util.Arrays.copyOf(Arrays.java:2271)
> >   at org.basex.util.TokenBuilder.add(TokenBuilder.java:303)
> >   at org.basex.util.TokenBuilder.add(TokenBuilder.java:290)
> >   at org.basex.index.ft.FTBuilder.merge(FTBuilder.java:248)
> >   at org.basex.index.ft.FTBuilder.write(FTBuilder.java:155)
> >   at org.basex.index.ft.FTBuilder.index(FTBuilder.java:94)
> >   at org.basex.index.ft.FTBuilder.build(FTBuilder.java:102)
> >   at org.basex.index.ft.FTBuilder.build(FTBuilder.java:1)
> >   at org.basex.data.DiskData.createIndex(DiskData.java:195)
> >   at org.basex.core.cmd.ACreate.create(ACreate.java:117)
> >   at org.basex.core.cmd.CreateIndex.run(CreateIndex.java:62)
> >   at org.basex.core.Command.run(Command.java:398)
> >   at org.basex.core.Command.execute(Command.java:100)
> >   at org.basex.core.Command.execute(Command.java:123)
> >   at org.basex.gui.dialog.DialogProgress$1.run(DialogProgress.java:178)
> >
> > Regards
> > Lars G Johnsen
> > National Library of Norway
>