Tim,

Thanks so much! This is wonderful information. I really appreciate your
taking the time to write all this up. I'm going to look into the things
you mentioned and do a little exploring to see if I can eliminate the
problem, or at least use this information to narrow down what is really
going on. I likely won't email back about it for perhaps two weeks (other
things going on), but I'll certainly get back to you with whatever
information I can gather that might help improve the stability of the
AIP tool.

Thanks again!

 - Patrick

On Mon, Aug 22, 2011 at 2:09 PM, Tim Donohue <tdono...@duraspace.org> wrote:

> Hi Patrick,
>
>
> On 8/19/2011 2:33 PM, Patrick Etienne wrote:
>
>  To start, I should toss out a little system information:
>> DSpace v. 1.7.2
>> Java version "1.6.0_26"
>> PostgreSQL 8.4.7
>> All on the same RedHat box.
>>
>
> Thanks!  This all looks reasonable/normal.
>
>
>  The next important detail is that I should have removed the
>> "manifestOnly" option from the command as I've experienced the error
>> with manifestOnly set to true as well as left out (defaulted to false).
>> As a side note, this use of manifestOnly was more of an afterthought
>> (I'd noted its experimental nature), and I'm not certain that it
>> actually does quite what I was looking for. My purpose is setting up
>> instances with data so that I can fully test themes I'm building for
>> various institutions, but I'm not really needing "content files"
>> (bitstreams), just the communities, sub-communities, collections,
>> item-pages, and /references/ to content files (having files or not
>> doesn't really matter as much, but it'd be preferable to leave out the
>> asset store). From the description in your email, it does sound indeed
>> as though setting the manifestOnly option to "true" would be good for my
>> use-case /as long as/ the feature was stable (which, as it has been
>> said, it's not as of yet). But again, the manifestOnly option is not a
>> priority (for me) at this point.
>>
>
> OK, good to know. Yes, the manifestOnly option will not give you bitstreams
> (as you expected). It just includes metadata & structure
> (communities/collections etc). However, as mentioned, it is still very
> experimental. I've admittedly never tried to migrate content using the
> manifestOnly option -- so, I'm not sure how stable the import will be.
>
> But, as the import isn't the problem right now...we can leave that for
> later.
>
>
>  It sounds as though the next piece of the puzzle might be the database
>> connection. The postgres service that I'm using for the instance(s) is
>> on the same machine as the dspace instance(s). I think that eliminates
>> network troubles as a potential cause (not that there couldn't be
>> something else going on with the postgres service).
>>
>> Next up would be a little more detail concerning the behavior of the
>> attempted command. The error does not happen immediately, it does run
>> for a while before erroring out.
>>
> <snip>
>
>  After this I ended up shutting down all the tomcat instances save
>> for the large one and also increased the db.maxconnections to 400 (just
>> to give it a whirl). This seemed to allow the AIP export to go even
>> longer before erroring out.
>>
>
> Hmm..I wonder.
>
> From the reaction after increasing 'db.maxconnections', it sounds as though
> it could be that your DSpace is just running out of database connections (as
> though connections are not being closed properly).
>
> It's possible you are encountering this bug (which will be fixed in 1.8.0):
> https://jira.duraspace.org/browse/DS-930
>
> The AIP Export uses the RoleCrosswalk listed in DS-930 (to export user
> permissions/groups into the DSPACE-ROLES schema). We realized there was a
> small bug in the RoleCrosswalk, where it wasn't closing its DB Connections
> properly.
>
> So, if you don't care whether you are moving user permissions/groups, you
> can turn "off" usage of the RoleCrosswalk by changing the following in your
> dspace.cfg.
>
> First, you'd want to turn off the "DSPACE-ROLES" schema export, by removing
> it from this line (also remove the comma)
>
> aip.disseminate.techMD = PREMIS, DSPACE-ROLES
>
> Second, you'd want to turn off the "METSRIGHTS" schema export, by removing
> it from this line (also remove the last comma)
>
> aip.disseminate.rightsMD = DSpaceDepositLicense:DSPACE_DEPLICENSE, \
>    CreativeCommonsRDF:DSPACE_CCRDF, CreativeCommonsText:DSPACE_CCTEXT, METSRIGHTS
>
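> As a rough illustration only (this assumes your dspace.cfg still has the
> stock values shown above, so check your own file before editing), the two
> lines would end up looking something like this:
>
> aip.disseminate.techMD = PREMIS
>
> aip.disseminate.rightsMD = DSpaceDepositLicense:DSPACE_DEPLICENSE, \
>    CreativeCommonsRDF:DSPACE_CCRDF, CreativeCommonsText:DSPACE_CCTEXT
>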
> What this does is disable the export of users/groups (DSPACE-ROLES) and
> permissions (METSRIGHTS) in AIPs. WARNING: in most backup scenarios, you
> would NOT want to do this, as you cannot restore users/groups/permissions if
> they are not in the AIPs. However, for your purposes of moving around test
> content, you may not care whether users/groups/permissions are moved
> successfully.
>
> More info about DSPACE-ROLES and METSRIGHTS schemas can be found at:
> https://wiki.duraspace.org/display/DSDOC/DSpace+AIP+Format
>
> If you are encountering the DS-930 bug, disabling both "DSPACE-ROLES" and
> "METSRIGHTS" may actually allow the export to succeed.  But, as stated, you
> won't have any users/groups/permissions info in your AIPs. So, when you
> re-import those AIPs, everything will just end up with generic default
> permissions (i.e. publicly accessible, but a System Administrator has rights
> to edit everything, etc.).
>
> <snip>
>
>  One thing I discovered while tweaking the database settings was that running
>> the AIP export while tailing the log file showed that the
>> org.postgresql.util.PSQLException popped up at least a few times for
>> each AIP export I'd attempted to do.
>>
>
> Is it that same PSQLException each time, with the same error
> message/description?  Does it pop up for every item even when you bump up
> the 'db.maxconnections'?
>
> Just curious if a large value for 'db.maxconnections' is actually having a
> beneficial effect overall. If so, that would imply that you are likely
> hitting a bug in DSpace where it isn't closing DB connections after queries
> are run (i.e. like I described above, it may be the DS-930 bug).
>
>
>
>> "Is it a specific object in DSpace that causes the error?"
>> This is a great question. I'm not sure how to answer it though (would
>> probably need a "query level" of postgres logs?). I can say that for
>> certain this was the case for an earlier error (not one that related to
>> postgres issues) that we'd encountered with the AIP tool. The cause of
>> the earlier error was that we'd somehow managed to have a couple items
>> in the instance that did not have titles. After making sure that the
>> items were removed or given dc.title (pretty sure that was the exact
>> field) values, that error was resolved. If there are specific things to
>> look for within the postgres logs, I'm sure I can get my sysadmin to
>> give me access so that I can look through them. It seems to me that this
>> might have more to do with postgres than dspace, but it also seems like
>> a reasonable possibility that dspace is opening database calls that
>> remain idle and aren't completing, or something of the like. I'm not
>> sure. Any suggestions or avenues to explore for further information
>> toward troubleshooting this issue would be greatly appreciated.
>>
>
> Actually, I forget that this is a bit more difficult in 1.7.x.  In 1.8.0
> (coming in Oct), the logging has been vastly improved. So, you can now
> determine more easily what is going on by looking at the logs (see
> https://jira.duraspace.org/browse/DS-896).
>
> If you are "brave" you could actually apply the "Curator.patch" attached to
> DS-896, and it may provide you with more information in your dspace.log that
> could be of use in tracking down what is happening.
> (Obviously, don't do this in Production, but it sounds like everything you
> are working on is in a development environment.)
>
> As you can tell, many improvements are heading your way in 1.8.0 to make
> this a bit easier.
>
> - Tim
>



-- 
Patrick K. Etienne
Systems Analyst
Georgia Institute of Technology
Library & Information Center
(404) 385-8121
_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
