Re: API changes

Simon Helsen Thu, 17 May 2012 12:48:33 -0700

Yes, doing a GC has its issues in general, especially on a server, but 
we'd only do it at the very beginning (after startup) or offline. As I 
mentioned before, it doesn't really work reliably. It only increases the 
chance that the lock is released.


But for that reason, we are stuck with direct. It is also worth noting that 
direct is faster on Windows, or at least it was the last time we tested. On 
Linux, mapped seems somewhat faster.

Simon
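
For readers unfamiliar with the mapped-file issue discussed in this thread, 
here is a minimal sketch in plain Java NIO (not Jena/TDB code) of the partial 
workaround: drop all references to the mapped buffer, then retry the delete 
while hinting a GC. The class name MappedDeleteSketch, its method names, the 
retry count and the sleep interval are all made up for illustration. As noted 
above, System.gc() is only a hint, so this raises the chance that the lock is 
released on Windows but does not guarantee it.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; none of these names come from Jena or TDB.
class MappedDeleteSketch {

    // Map the file, write one byte, and let the buffer reference go out of scope.
    static void touchViaMapping(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 1024);
            buf.put(0, (byte) 1);
            buf.force(); // ask the OS to write the mapped region to disk
        }
        // Closing the channel does not unmap the region. The mapping stays
        // valid until the buffer object itself is garbage collected.
    }

    // Try to delete; on failure, hint a GC, back off, and try again.
    static boolean deleteWithGcRetries(Path file, int maxAttempts)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Files.deleteIfExists(file);
                return true; // the file is gone, or was never there
            } catch (IOException e) {
                System.gc();                 // a hint only; may not unmap
                Thread.sleep(50L * attempt); // assumed back-off, not tuned
            }
        }
        return !Files.exists(file);
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("mapped-sketch", ".dat");
        touchViaMapping(file);
        System.out.println("deleted = " + deleteWithGcRetries(file, 5));
    }
}
```

On Linux the first delete attempt normally succeeds, because an open mapping 
does not block unlinking a file there. The retry loop only matters on 
platforms such as Windows where the mapping holds a lock on the file.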




From: Andy Seaborne <[email protected]>
To: Simon Helsen/Toronto/IBM@IBMCA
Cc: [email protected], Andy Seaborne <[email protected]>
Date: 05/17/2012 03:43 PM
Subject: Re: API changes



On 17/05/12 20:21, Simon Helsen wrote:
> Andy,
>
> the workaround you mention, what is that? I know I read somewhere that
> forcing a full GC before trying to delete it may solve it, but we
> experimented with that and it only reduced the likelihood of it
> occurring. The lock was not always released when we tried. Is that the
> workaround you know of?

The bug report at "Sun" has lots of suggestions.

I've not tried any of them - assigning null and forcing a full gc looked 
like a possibility.  But forcing a GC in some hidden piece of code has 
its own issues for a general solution.

                 Andy

>
> Simon
>
>
> From:                  Andy Seaborne <[email protected]>
> To:            Simon Helsen/Toronto/IBM@IBMCA
> Cc:            [email protected], Andy Seaborne <[email protected]>
> Date:                  05/17/2012 03:09 PM
> Subject:               Re: API changes
>
>
> ------------------------------------------------------------------------
>
>
>
> On 17/05/12 19:59, Simon Helsen wrote:
>  > Andy,
>  >
>  > let me pick out this comment of yours:
>  >
>  > "You may wish to submit an enhancement if you want to manipulate cache
>  > sizes in detail. It's been on the "it would be nice" list but it isn't
>  > something many people need as they run on 64bit in production. I wonder
>  > if parameters ought to be related to location.
>  > "
>  >
>  > why do you say this?
>
> "Direct mode" if you like. For most people, direct = 32 bit, mapped =
> 64 bit. It's a unique feature of your deployment.
>
>  > We run only on 64 bit, but always in direct mode
>  > because mapped mode does not allow us to remove the index (as a reindex
>  > operation) because of the infamous JVM bug which holds a lock on the
>  > involved files, at least on Windows. Moreover, it is unstable under
>  > system crashes since the actual flush may not have taken place after a
>  > close.
>
> We have transactions now.
>
> (But you still can't delete datasets on Windows without using one of the
> (partial) workarounds for that bug.)
>
>  > That too is probably more a problem on Windows. The latter may be
>  > outdated now with transactions and journaling, although it remains to be
>  > seen if we could still corrupt the index this way. Either way, I would
>  > never recommend using mapped I/O for production purposes. We tried and
>  > miserably failed. And yes, not all production systems are Linux.
>  >
>  > Anyhow, I'll look into the enhancement
>
> Great.
>
> Andy
>
>  >
>  > Simon
>  >
>  >
>  >
>  > From: Andy Seaborne <[email protected]>
>  > To: [email protected]
>  > Date: 05/17/2012 04:28 AM
>  > Subject: Re: API changes
>  >
>  >
>  > ------------------------------------------------------------------------
>  >
>  >
>  >
>  > On 16/05/12 23:50, Simon Helsen wrote:
>  > > Andy,
>  > >
>  > > I know you asked a while back about what API changes I had noticed.
>  > > There are a number of things that have changed since 2.6.3,
>  >
>  > October 2010
>  >
>  > > most of them are
>  > > smaller and manageable (albeit annoying). Here are a few examples:
>  > >
>  > > 1) DatasetGraphTDB subset = TDBFactory.createDatasetGraph();
>  > >
>  > > now has to be
>  > >
>  > > DatasetGraph subset = TDBFactory.createDatasetGraph();
>  >
>  > DatasetGraphTDB is an internal class (and it's not DatasetGraphTDB now :-)
>  >
>  > >
>  > > 2) Another example is:
>  > >
>  > > queryContext.setDataset(new DatasetImpl(subset));
>  > >
>  > > now has to be
>  > >
>  > > queryContext.setDataset(DatasetImpl.wrap(subset));
>  >
>  > I don't understand this one - what's queryContext?
>  > The only setDataset is on ExecutionContext in
>  >
>  > com.hp.hpl.jena.sparql.engine
>  >
>  > which is inside the engine and not an API call.
>  >
>  > There is QueryExecutionFactory for creating a QueryExecution for a dataset.
>  >
>  > The call you want is
>  > DatasetFactory.create(DatasetGraph)
>  >
>  > DatasetImpl is, well, an Impl class.
>  >
>  > > 3) TupleIndex newIndex = SetupTDB.makeTupleIndex(tmpLocation,
>  > > srcGraph.getConfig(), primary, name, name, indexRecordLen);
>  > >
>  > > became
>  > >
>  > > TupleIndex newIndex = SetupTDB.makeTupleIndex(tmpLocation, primary,
>  > > name, name, indexRecordLen);
>  >
>  > Doing anything with internal classes is at your own risk!
>  >
>  > >
>  > > 4) More obscure is the following change:
>  > >
>  > > SetupTDB.globalConfig.setProperty(Names.pBlockReadCacheSize,
>  > > blockReadCacheSize.toString());
>  >
>  > SetupTDB is a wrapper to old code.
>  >
>  > SystemParams has the constants as finals.
>  >
>  > See SetupTDB for setting constants from properties.
>  >
>  > You may wish to submit an enhancement if you want to manipulate cache
>  > sizes in detail. It's been on the "it would be nice" list but it isn't
>  > something many people need as they run on 64bit in production. I wonder
>  > if parameters ought to be related to location.
>  >
>  > >
>  > > That does not work anymore. It is unclear where I can set this property
>  > > now. I could use help with this one.
>  > >
>  > >
>  > > In general, API changes are unavoidable, but they should be documented
>  > > per release in a readme.
>  > >
>  > > Simon
>  > >
>  > >
>  >
>  > Andy
>  >
>  >
>  >
>
>
>
