Please help me with the new TDB 0.9.4. I am trying to load the BTC dataset from the SemSearch-2011 challenge (http://semsearch.yahoo.com) into a TDB store using TDB 0.9.4 (Jena 2.7.4).
I got the following:

egor@egorov:~/semsearch-2011/dataset$ tdbloader2 --loc ../tdb btc-2009-chunk-010-urified.gz
12:00:43 -- TDB Bulk Loader Start
12:00:43 Data phase
INFO Load: btc-2009-chunk-010-urified.gz -- 2013/02/22 12:00:45 MSK
INFO Add: 50 000 Data (Batch: 23 529 / Avg: 23 529)
INFO Add: 100 000 Data (Batch: 43 103 / Avg: 30 441)
INFO Add: 150 000 Data (Batch: 46 685 / Avg: 34 435)
INFO Add: 200 000 Data (Batch: 85 324 / Avg: 40 469)
INFO Add: 250 000 Data (Batch: 61 349 / Avg: 43 425)
INFO Add: 300 000 Data (Batch: 98 814 / Avg: 47 900)
INFO Add: 350 000 Data (Batch: 85 034 / Avg: 51 087)
INFO Add: 400 000 Data (Batch: 102 880 / Avg: 54 518)
INFO Add: 450 000 Data (Batch: 113 378 / Avg: 57 855)
INFO Add: 500 000 Data (Batch: 81 168 / Avg: 59 566)
INFO Elapsed: 8,39 seconds [2013/02/22 12:00:53 MSK]
INFO Add: 550 000 Data (Batch: 100 000 / Avg: 61 839)
INFO Add: 600 000 Data (Batch: 92 081 / Avg: 63 579)
INFO Add: 650 000 Data (Batch: 47 892 / Avg: 62 016)
INFO Add: 700 000 Data (Batch: 97 847 / Avg: 63 682)
INFO Add: 750 000 Data (Batch: 95 602 / Avg: 65 132)
INFO Add: 800 000 Data (Batch: 100 806 / Avg: 66 605)
INFO Add: 850 000 Data (Batch: 86 805 / Avg: 67 529)
INFO Add: 900 000 Data (Batch: 103 950 / Avg: 68 870)
INFO Add: 950 000 Data (Batch: 102 040 / Avg: 70 069)
INFO Add: 1 000 000 Data (Batch: 85 034 / Avg: 70 691)
INFO Elapsed: 14,15 seconds [2013/02/22 12:00:59 MSK]
INFO Add: 1 050 000 Data (Batch: 105 708 / Avg: 71 824)
INFO Add: 1 100 000 Data (Batch: 105 708 / Avg: 72 886)
INFO Add: 1 150 000 Data (Batch: 43 327 / Avg: 70 786)
INFO Add: 1 200 000 Data (Batch: 94 876 / Avg: 71 543)
INFO Add: 1 250 000 Data (Batch: 94 517 / Avg: 72 245)
INFO Add: 1 300 000 Data (Batch: 84 602 / Avg: 72 654)
INFO Add: 1 350 000 Data (Batch: 94 517 / Avg: 73 281)
INFO Add: 1 400 000 Data (Batch: 90 909 / Avg: 73 792)
INFO Add: 1 450 000 Data (Batch: 74 850 / Avg: 73 828)
INFO Add: 1 500 000 Data (Batch: 91 074 / Avg: 74 297)
INFO Elapsed: 20,19 seconds [2013/02/22 12:01:05 MSK]
INFO Add: 1 550 000 Data (Batch: 95 238 / Avg: 74 828)
INFO Add: 1 600 000 Data (Batch: 76 335 / Avg: 74 874)
INFO Add: 1 650 000 Data (Batch: 95 602 / Avg: 75 369)
INFO Add: 1 700 000 Data (Batch: 100 401 / Avg: 75 926)
INFO Add: 1 750 000 Data (Batch: 103 519 / Avg: 76 509)
ERROR [line: 1777296, col: 106] Bad language tag
Exception in thread "main" org.openjena.riot.RiotException: [line: 1777296, col: 106] Bad language tag
    at org.openjena.riot.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:130)
    at org.openjena.riot.lang.LangEngine.raiseException(LangEngine.java:169)
    at org.openjena.riot.lang.LangEngine.nextToken(LangEngine.java:116)
    at org.openjena.riot.lang.LangNQuads.parseOne(LangNQuads.java:54)
    at org.openjena.riot.lang.LangNQuads.parseOne(LangNQuads.java:34)
    at org.openjena.riot.lang.LangNTuple.runParser(LangNTuple.java:69)
    at org.openjena.riot.lang.LangBase.parse(LangBase.java:43)
    at org.openjena.riot.RiotLoader.readQuads(RiotLoader.java:206)
    at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.exec(CmdNodeTableBuilder.java:168)
    at arq.cmdline.CmdMain.mainMethod(CmdMain.java:101)
    at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
    at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
    at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.main(CmdNodeTableBuilder.java:79)

The offending language tag at line 1777296 is @"18". The load aborts on it, but I need to load the entire dataset and skip errors of this type.

How can I make tdbloader ignore this type of error?

Thank you.
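For reference, one workaround that does not depend on any loader option is to pre-filter the file and drop the lines whose language tag is malformed before running tdbloader2. The following is only a rough sketch under assumptions: the function names and the output file name are hypothetical, the regular expressions are a heuristic approximation of the N-Quads literal and language tag syntax rather than a full parser, and the bad tag is assumed to directly follow the closing quote of the literal, as in "..."@"18". It will not catch other kinds of syntax errors in the data.

```python
import gzip
import re

# First quoted literal on the line, optionally followed by "@" and a tag candidate.
# Heuristic: assumes the first double quote on the line opens the object literal.
LITERAL = re.compile(r'"(?:[^"\\]|\\.)*"(?:@([^\s.]*))?')

# Approximate shape of a language tag: letters, then "-"-separated alphanumerics.
VALID_TAG = re.compile(r'[a-zA-Z]+(?:-[a-zA-Z0-9]+)*')


def has_bad_language_tag(line):
    """Return True if the line's literal carries a malformed language tag."""
    match = LITERAL.search(line)
    if match is None or match.group(1) is None:
        return False  # no literal, or a literal without "@tag"
    return VALID_TAG.fullmatch(match.group(1)) is None


def keep_good_lines(lines):
    """Yield only the lines that pass the language tag check."""
    for line in lines:
        if not has_bad_language_tag(line):
            yield line


def clean_file(src_path, dst_path):
    """Copy a gzipped N-Quads file, dropping flagged lines; return the drop count."""
    dropped = 0
    with gzip.open(src_path, "rt", encoding="utf-8", errors="replace") as src, \
         gzip.open(dst_path, "wt", encoding="utf-8") as dst:
        for line in src:
            if has_bad_language_tag(line):
                dropped += 1
            else:
                dst.write(line)
    return dropped
```

With this, calling clean_file("btc-2009-chunk-010-urified.gz", "btc-2009-chunk-010-cleaned.gz") would write a filtered copy (the second file name is made up for the example), and tdbloader2 could then be pointed at the cleaned file. This loses the dropped quads entirely, so I would still prefer a way to make the loader itself skip them, if one exists.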
