Re: [Dspace-tech] Search for date in srw/u

2008-09-16 Thread Klaus Hessellund
No, actually I have no idea whether it's in Lucene or not :-$. I meant it as 
a question. To me it just looked like it was indexed somehow.

Bottom line: being able to search for a specific date in SRW would make my day :-)
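
The workaround mentioned earlier in the thread (take whatever the browse
returns and strip the unwanted posts in a separate script) could be sketched
roughly as below. It is only an illustration: the record layout (a dict with
a `date_issued` string in ISO `YYYY-MM-DD` form) and the function name are
assumptions, not anything DSpace or SRW provides.

```python
def filter_by_date_prefix(records, prefix):
    """Keep only records whose issue date starts with the given prefix.

    `prefix` can be a year ("2005"), a year and month ("2005-12"),
    or a full day ("2005-12-24").
    """
    return [r for r in records if (r.get("date_issued") or "").startswith(prefix)]


# Hypothetical records, as they might look after parsing a browse result.
records = [
    {"title": "A", "date_issued": "2005-12-03"},
    {"title": "B", "date_issued": "2005-12-24"},
    {"title": "C", "date_issued": "2006-01-15"},
]

december_2005 = filter_by_date_prefix(records, "2005-12")
print([r["title"] for r in december_2005])
```

Matching on a string prefix only works if the dates are zero-padded ISO
strings; any other date format would need real date parsing first.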

/Klaus



Re: [Dspace-tech] Search for date in srw/u

2008-09-15 Thread LeVan,Ralph
SRW Index Browse does not work in DSpace 1.5.  Mark Diggory has a programmer 
working on that now.

The example URL that you attached does not look like a search to me.  It looks 
like a browse with qualifiers.  Your comment that it doesn't actually return 
only documents with those dates reinforces that impression.

Are you really sure there are year and month date indexes in Lucene?  Anyone 
closer to the application have an opinion?
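
One way to check whether a date index is actually exposed is to send a plain
SRU searchRetrieve request and look at what comes back. The sketch below only
builds such a URL. The endpoint path, the index name `dc.date` and the
`2005-12` value are assumptions for illustration; whether the DSpace SRW
database defines any date index at all is exactly the open question here.

```python
from urllib.parse import urlencode


def build_sru_search_url(base_url, cql_query, maximum_records=20):
    """Build an SRU 1.1 searchRetrieve URL for the given CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and index name; substitute whatever the server's
# explain response actually lists.
url = build_sru_search_url("http://hostname/SRW/search/DSpace",
                           'dc.date = "2005-12"')
print(url)
```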

Ralph

> -----Original Message-----
> From: Klaus Hessellund [mailto:[EMAIL PROTECTED]
> Sent: Monday, September 15, 2008 4:05 AM
> To: LeVan,Ralph
> Cc: Mika Stenberg; dspace-tech@lists.sourceforge.net
> Subject: Re: [Dspace-tech] Search for date in srw/u
> 
> Hi Ralph,
> 
> I've tried all the search indexes and they all fail with an exception.
> 
> I've just installed the 1.5.1 release instead of the svn version and get
> the same result.
> 
> Is there any way I can search for year and month, as I can in DSpace?
> 
> For example, if I want all posts from 2005-08, I can use:
>  http://hostname/xmlui/browse?rpp=20&etal=-1&type=dateissued&sort_by=2&order=ASC&month=12&year=2005
> 
> So year and month are somehow indexed, and the question is whether I can
> do something similar in SRW.
> 
> Unfortunately I can't specify the day of the month, and DSpace gives me
> posts that are newer than the specified date. That is, I get posts from
> the specified date and later. It is better than nothing, and I can strip
> the newer posts in another script. That would be a workaround for me ;-)
> 
> 
> /Klaus

Re: [Dspace-tech] Search for date in srw/u

2008-09-12 Thread LeVan,Ralph
I'm not denying responsibility for this problem, but it looks like a DSpace 
issue.  If you walk down that stack trace far enough, you run into this:

Caused by: java.lang.NoClassDefFoundError: com/ibm/icu/text/Normalizer
    at org.dspace.text.filter.DecomposeDiactritics.filter(DecomposeDiactritics.java:54)
    at org.dspace.sort.AbstractTextFilterOFD.makeSortString(AbstractTextFilterOFD.java:116)
    at org.dspace.sort.OrderFormat.makeSortString(OrderFormat.java:125)
    at org.dspace.browse.BrowseEngine.normalizeJumpToValue(BrowseEngine.java:720)
    at org.dspace.browse.BrowseEngine.browseByValue(BrowseEngine.java:487)
    at org.dspace.browse.BrowseEngine.browse(BrowseEngine.java:128)
    at ORG.oclc.os.SRW.DSpaceLucene.SRWLuceneDatabase.getTermList(SRWLuceneDatabase.java:276)

In other words, I asked DSpace to return a list of terms, and in doing so it 
invoked its DecomposeDiactritics filter, which tried to call 
com.ibm.icu.text.Normalizer and couldn't find it. That class comes from ICU4J, 
so the ICU4J jar is probably not on the web application's classpath.

Now, as it happens, I believe that names are handled internally by DSpace and 
not left to Lucene to handle.  You might want to try browsing on yet another 
index and see what happens.
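
A NoClassDefFoundError like the one above can be narrowed down by checking
which jars, if any, in the web application's lib directory contain the
missing class. The sketch below is one way to do that. The directory path in
the usage line is a placeholder, and the helper name is made up for this
example.

```python
import zipfile
from pathlib import Path


def jars_containing(lib_dir, class_name):
    """Return the names of jars under lib_dir that contain class_name.

    class_name is given in dotted form, e.g. "com.ibm.icu.text.Normalizer".
    """
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives.
            continue
    return hits


if __name__ == "__main__":
    # Placeholder path: point this at the SRW webapp's WEB-INF/lib.
    print(jars_containing("/path/to/webapps/SRW/WEB-INF/lib",
                          "com.ibm.icu.text.Normalizer"))
```

An empty list would be consistent with the error in the stack trace; a
non-empty list would suggest looking at class loader setup instead.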

Ralph


> -----Original Message-----
> From: Klaus Hessellund [mailto:[EMAIL PROTECTED]
> Sent: Friday, September 12, 2008 10:12 AM
> To: LeVan,Ralph
> Cc: Mika Stenberg; dspace-tech@lists.sourceforge.net
> Subject: Re: [Dspace-tech] Search for date in srw/u
> 
> I can't browse any indexes.  If I search dc.creator for "Hansen" and
> set "Response position" to 1, I get this exception:
> 
> soapenv:Server.userException
> java.lang.reflect.InvocationTargetException
> 
> java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:585)
>     at org.apache.axis.providers.java.RPCProvider.invokeMethod(RPCProvider.java:397)
>     at org.apache.axis.providers.java.RPCProvider.processMessage(RPCProvider.java:186)
>     at org.apache.axis.providers.java.JavaProvider.invoke(JavaProvider.java:323)
>     at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
>     at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118)
>     at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83)
>     at org.apache.axis.handlers.soap.SOAPService.invoke(SOAPService.java:454)
>     at org.apache.axis.server.AxisServer.invoke(AxisServer.java:281)
>     at ORG.oclc.os.SRW.SRWServlet.processMethodRequest(SRWServlet.java:1397)
>     at ORG.oclc.os.SRW.SRWServlet.doGet(SRWServlet.java:296)
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:690)
>     at org.apache.axis.transport.http.AxisServletBase.service(AxisServletBase.java:327)
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210)
>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
>     at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:200)
>     at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:283)
>     at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:773)
>     at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:703)
>     at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:895)
>     at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685)
>     at java.lang.Thread.run(Thread.java:595)
> Caused by: java.lang.NoClassDefFoundError: com/ibm/icu/text/Normalizer
>     at org.dspace.text.filter.DecomposeDiactritics.filter(DecomposeDiactritics.java:54)
>     at org.dspace.sort.AbstractTextFi

Re: [Dspace-tech] Search for date in srw/u

2008-09-12 Thread Klaus Hessellund
Hi David,

thanks for your message and interest. This is very welcome, as I would
really like to gain more insight into rsyslog's performance in
real-world and extreme scenarios. So far I have unfortunately been
unable to do that, because I had no funding for the required hardware
and the time needed to conduct the testing. I would *really* appreciate
it if you could help with that, and I would be very willing to tweak the
code for optimum performance (actually, I am always very concerned about
high performance, and I was a bit sad that I could not ensure it).

Having said that, I would first of all like to have a look at your
rsyslog.conf, so that we are on common ground. Also, I think this can
become quite lengthy and be of interest to others. It may be useful
to discuss this on the web forum, so that we can keep track of things
more easily. If you think this is useful, I recommend this one:

http://kb.monitorware.com/general-f34.html

some more comments inline:

On Mon, 2008-09-15 at 07:23 +0200, [EMAIL PROTECTED] wrote:
> I've got a couple of projects I'm working on that rsyslog is looking like
> a good answer to. I'm going to send questions about them in separate
> messages.
> 
> the first project is a traditional syslog server. I've been testing
> various syslog implementations to find out what sort of performance they
> can sustain.
> 
> I've gotten the (almost) standard sysklogd to sustain almost 30,000
> messages/sec (udp) before it starts losing significant numbers of logs
> 
> rsyslog is able to handle bursts of around 150,000 messages/sec, but it
> doesn't seem to write them out very fast, and over time seems to be
> limited to ~22,000 messages/sec
> 
> the hardware I am running this test on is insanely overpowered (four
> dual-core opterons, 16G of ram, battery-backed cache on a raid card with
> high-speed SAS drives). in production I will be working with slower
> hardware, but I'm trying to find the limits of the software before I
> start introducing lower hardware limits.
> 
> what can I look at doing to speed up the writing of data from the queues?
> does rsyslog write the messages one at a time, or does it have an option
> for writing them in batches (pulling a bunch off of the queue and sending
> them all at once)?

The design is that each individual message is pushed to the output. The
output then writes the message. There has been discussion about a lazy
write, but nothing in this regard has been implemented yet.

Note that this is a good time to request new things. I am finished with
the last major thing (TLS support) and about to start the next one
(enhanced scripting), but scripting, I think, could be moved to a later
time in favour of something more important (like performance ;)).

> some sort of batch mode would be critical for writes to a database where
> you can frequently get 1000x speedups if you do the inserts as a single
> transaction as opposed to individual transactions.

I am not (any longer) so much a database guy. Can I really do a batch
insert with them? Can you point me to an API?
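
As a rough illustration of what a batch insert looks like from the client
side, the sketch below uses Python's built-in sqlite3 module: many rows are
handed to executemany() and committed as a single transaction. The table
name and columns are made up for the example, the actual speedup depends on
the database and driver, and none of this is rsyslog code.

```python
import sqlite3


def insert_batch(conn, messages):
    """Insert all messages inside one transaction instead of one per row."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT INTO syslog (host, msg) VALUES (?, ?)",
            messages,
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE syslog (host TEXT, msg TEXT)")

# Hypothetical batch pulled off a queue.
batch = [("host%d" % (i % 4), "message %d" % i) for i in range(1000)]
insert_batch(conn, batch)

count = conn.execute("SELECT COUNT(*) FROM syslog").fetchone()[0]
print(count)
```

Other databases offer the same pattern through their own client libraries:
open a transaction, send many INSERTs (or one multi-row INSERT), then commit
once.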

Looking forward to your reply!
Rainer


___
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog