On Thu, Dec 12, 2013 at 2:22 PM, Andy Seaborne <[email protected]> wrote:

> Hi Rick,
>
> On 12/12/13 11:03, Rick Moynihan wrote:
>
>> Hi all,
>>
>> I have a script which dumps 2 modestly sized N-Triples files into Fuseki
>> via curl and an HTTP PUT.
>>
>> e.g. the script does the following 2 actions:
>>
>> curl -X PUT --data-binary @data/file-1.nt -H 'Content-Type: text/plain' 'http://localhost:3030/linkeddev-test/data?graph=http://foo-bar.org/graph1'
>>
>>
> Unrelated, aesthetically better:
> Content-Type: application/n-triples
>
> (Fuseki/RIOT ignores text/plain and uses the file extension - text/plain
> is wrong so often that it's unreliable).



Good point.
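
For reference, a minimal sketch of the two uploads with the media type
suggested above. The endpoint and graph URIs are the ones from my original
commands; the put_graph helper and the DRY_RUN switch are hypothetical names
added only for illustration, and --fail is there so that an HTTP error from
Fuseki stops the script rather than passing silently.

```shell
#!/bin/sh
# Sketch only: PUT one N-Triples file into a named graph.
# ENDPOINT matches the dataset used in this thread; put_graph and
# DRY_RUN are illustrative names, not part of Fuseki or curl.
ENDPOINT='http://localhost:3030/linkeddev-test/data'

put_graph() {
    # $1 = local N-Triples file, $2 = target graph URI
    url="${ENDPOINT}?graph=$2"
    if [ -n "${DRY_RUN:-}" ]; then
        # Print the command instead of contacting the server.
        printf '%s\n' "curl --fail -X PUT --data-binary @$1 -H 'Content-Type: application/n-triples' '$url'"
        return 0
    fi
    # --fail makes curl exit non-zero on an HTTP error status.
    curl --fail -X PUT --data-binary "@$1" \
         -H 'Content-Type: application/n-triples' "$url"
}
```

The script would then call the helper twice, chained with && so that the
second PUT only starts once the first has succeeded:

    put_graph data/file-1.nt http://foo-bar.org/graph1 &&
    put_graph data/file-2.nt http://foo-bar.org/graph2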


>
>
>> curl -X PUT --data-binary @data/file-2.nt -H 'Content-Type: text/plain' 'http://localhost:3030/linkeddev-test/data?graph=http://foo-bar.org/graph2'
>>
>
> And it does them one after the other, never in parallel?


Yes they're sequential, never parallel.  Is parallel update an issue?


>
>
>> File 1 is 162mb
>> File 2 is 223mb
>>
>
> so about 1.6 and 2.2 million triples?


740,000 and 1.6 million.

>
>
>> Sometimes this imports fine, other times the import takes minutes, Fuseki
>> consumes 380% CPU and I have to kill it after a few minutes.
>>
>
> When it's fine, how long does it take?
>
>
Approximately 2m 40s for both datasets.


> It might be GC pressure: the GC is working very hard but not making
> significant progress - this can show as very high CPU, nothing happening
> and then an OOME. How much heap have you given the Java process?
>
> The other thing to look at is memory-mapped files. TDB uses mmapped files,
> which are not part of the Java heap.  Don't give Fuseki all of RAM
> for the heap - leave as much for the OS to use for file system cache as
> possible (but Fuseki still needs a decent heap to manage transactions).
>

Thanks for the advice.  Raising the heap from 1.2gb to 4gb seems to have
made the problem disappear.
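
For anyone hitting the same thing, a rough sketch of how the heap can be
raised when starting the server by hand. -Xmx is the standard JVM option for
the maximum heap size; the jar name, the --loc directory (DB) and the
--update flag are assumptions about a typical standalone Fuseki setup rather
than my exact command line.

```shell
# Sketch only: start Fuseki with a 4 GB maximum Java heap.
# "DB" is a placeholder TDB directory; adjust it to the real location.
# The rest of RAM stays available to the OS for TDB's memory-mapped files.
java -Xmx4G -jar fuseki-server.jar --update --loc=DB /linkeddev-test
```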

>
> I assume it's a 64 bit machine but which OS? (Even amongst Linuxes
> handling of mmap varies for reasons I don't understand.)
>

It's a Mac.



R.
