Need Features offered and comparison Chart for Solr 3.6 and Solr 4.6

2014-01-12 Thread MAYANK SOLANKI
Hi Team,

I need details on the following:

1. What features does Apache Solr offer?

2. A comparison chart between Apache Solr 3.6 and Solr 4.6.


Please provide this as soon as possible so that I can propose it for
implementation at my company.


Regards,
Mayank Solanki


Re: using extract handler: data not extracted

2014-01-12 Thread sweety
I am working on Windows 7





background merge hit exception while optimizing index (SOLR 4.4.0)

2014-01-12 Thread Ralf Matulat

Hi,
I am currently running into merge-issues while optimizing an index.

To give you some information:

We are running 4 SOLR servers with identical OS, VM hardware, RAM, etc.
Only one server is having issues so far; the others are fine.

We are running SOLR 4.4.0 with Tomcat 6.0.
It had been running since October without any problems.
The problems first occurred after a minor change to synonyms.txt,
but I guess that was just a coincidence.


We added `ulimit -v unlimited` to our tomcat init-script years ago.

We have 4 cores running on each SOLR server; the configuration and index sizes
of all 4 servers are identical (we distribute the configs via git).


We rebuilt the index twice: the first time without removing the old
index files, the second time deleting the data dir and starting from scratch.


We are working with the DIH, getting data from a MySQL DB.
After an initial complete index run, the optimize works. The
optimize then fails one or two days later.


We do one optimize run a day; the index contains ~10 million
documents, and the index size on disk is ~39GB, with 127GB of free
disk space.
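
For context, the daily optimize boils down to a forceMerge call like the one
below (a minimal SolrJ sketch only; the host, port and core name are
placeholders, not our actual values):

import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class NightlyOptimize {
    public static void main(String[] args) throws Exception {
        // Placeholder URL; one such call is made per core.
        HttpSolrServer server = new HttpSolrServer("http://localhost:8080/solr/core1");
        // waitFlush=true, waitSearcher=true, maxSegments=1
        // -> ends up in IndexWriter.forceMerge(1), as in the trace below.
        server.optimize(true, true, 1);
        server.shutdown();
    }
}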


We have a mergeFactor of 3.

The solr.log says:

ERROR - 2014-01-12 22:47:11.062; org.apache.solr.common.SolrException;
java.io.IOException: background merge hit exception:
_dc8(4.4):C9876366/1327 _e8u(4.4):C4250/7 _f4a(4.4):C1553/13 _fj6(4.4):C1903/15
_ep3(4.4):C1217/42 _fle(4.4):C256/7 _flf(4.4):C11 into _flg [maxNumSegments=1]
    at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1714)
    at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1650)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:530)
    at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:95)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalCommit(DistributedUpdateProcessor.java:1235)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1219)
    at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:157)
    at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:735)
Caused by: java.lang.NullPointerException
    at java.nio.ByteBuffer.get(ByteBuffer.java:661)
    at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:245)
    at org.apache.lucene.store.ByteBufferIndexInput.readBytes(ByteBufferIndexInput.java:107)
    at org.apache.lucene.codecs.lucene41.ForUtil.readBlock(ForUtil.java:197)
    at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.refillDocs(Lucene41PostingsReader.java:748)
    at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextDoc(Lucene41PostingsReader.java:813)
    at org.apache.lucene.codecs.MappingMultiDocsAndPositionsEnum.nextDoc(MappingMultiDocsAndPositionsEnum.java:104)
    at org.apache.lucene.codecs.PostingsConsumer.merge(PostingsConsumer.java:101)
    at org.apache.lucene.codecs.TermsConsumer.merge(TermsConsumer.java:164)
    at org.apache.luc

Re: Correct to use to store urls (unicode)

2014-01-12 Thread Gora Mohanty
On 13 January 2014 00:30, Hakim Benoudjit  wrote:
>
> Yep sure. But is it good for me to store a link(http://...) in a solr
> string field? knowing that this field isnt indexed, only stored.

Yes, there is no issue. Not sure why they are not indexed, but if
that is what you want, ...
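
For what it's worth, a stored-only string field holds such a URL without any
trouble. A minimal SolrJ sketch (the core URL and the field names are just
placeholders, assuming a string field that is stored but not indexed):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class StoreUrl {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "url-1");
        // Percent-encoded Unicode is just ordinary characters in a Java String;
        // the stored value comes back exactly as it was sent.
        doc.addField("url", "http://example.com/caf%C3%A9");
        server.add(doc);
        server.commit();
        server.shutdown();
    }
}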

Regards,
Gora


How Solr join query works?

2014-01-12 Thread solr2020
Hi All,

Can anyone please explain how the Solr join query works in Solr 4.2?
We have 2 different documents. Both are in the same index.

document1 contains the columns:

docdate:  01-12-2012
previousmonthdate :01-11-2012
price:15
and some more fields.

document2 contains:

docdate :01-11-2012
previousmonthdate :01-10-2012
price:10
and some more fields.

Here we have the same value in previousmonthdate (in document1) and docdate
(in document2). So we want to make a join query based on this to retrieve
these in a single document.

The final document should look like this:
docdate:  01-12-2012
previousmonthdate :01-11-2012
price:15
price:10 (this is from document2)

Is it possible using a Solr join query? Or is there any other approach?
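
For reference, the join syntax I have been looking at is roughly the following
(just a sketch using the field names above; I have not verified the result yet):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class JoinExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        // Matches document1 by its docdate, follows previousmonthdate -> docdate,
        // and returns the matching documents on the "to" side (document2 here).
        SolrQuery q = new SolrQuery(
                "{!join from=previousmonthdate to=docdate}docdate:\"01-12-2012\"");
        QueryResponse rsp = server.query(q);
        System.out.println(rsp.getResults());
    }
}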

Please help..

Thanks.








Re: How to index data in muliValue field with key

2014-01-12 Thread rachun
Thank you, Mr. Steve. Now I understand; I figured out that I should separate the
field into title_th and title_en, and it worked ;)





Re: using extract handler: data not extracted

2014-01-12 Thread Andrea Gazzarini
Not really sure... the issue seems related to text extraction, so the first
suspect is Tika; SOLR is playing a secondary role here. If Tika is doing the
extraction correctly and the content still ends up empty, there should be an
error or a warning on the Solr side (an exception, a "content field too long"
warning, or something like that).

What about option 3a above (FINEST on org.apache + grep "tika")?
 On 12 Jan 2014 17:38, "sweety"  wrote:

> Sorry for the mistake.
> im using solr 4.2, it has tika-1.3.
> So now, java  -jar tika-app-1.3.jar -v C:\Coding.pdf , parses pdf document
> without error or msg.
> Also, java  -jar tika-app-1.3.jar -t C:\Coding.pdf, shows the entire
> document.
> Which means there is no problem in tika right??
>
>
>
>


Re: Correct to use to store urls (unicode)

2014-01-12 Thread Hakim Benoudjit
Yep, sure. But is it good for me to store a link (http://...) in a Solr
string field, knowing that this field isn't indexed, only stored?


2014/1/12 Gora Mohanty 

> On 12 January 2014 20:07, Hakim Benoudjit  wrote:
> > I have just forget the u'' next to a unicode string :\,
>
> Heh! Handling of Unicode in Python 2.x is annoying. 3.x
> is better, but only a little. Off-topic on this list, so I will
> shut up now.
>
> Regards,
> Gora
>


Re: using extract handler: data not extracted

2014-01-12 Thread sweety
Sorry for the mistake.
I'm using Solr 4.2; it comes with Tika 1.3.
So now, java -jar tika-app-1.3.jar -v C:\Coding.pdf parses the PDF document
without any error or message.
Also, java -jar tika-app-1.3.jar -t C:\Coding.pdf shows the entire
document.
Which means there is no problem in Tika, right?







Re: using extract handler: data not extracted

2014-01-12 Thread sweety
Sorry for the mistake.
I'm using Solr 4.2; it comes with Tika 1.3.
So now, java -jar tika-app-1.3.jar -v C:\Coding.pdf parses the PDF document
without any error or message.
Also, java -jar tika-app-1.4.jar -t C:\Cloud.docx shows the entire
document.
Which means there is no problem in Tika, right?





Re: Correct to use to store urls (unicode)

2014-01-12 Thread Gora Mohanty
On 12 January 2014 20:07, Hakim Benoudjit  wrote:
> I have just forget the u'' next to a unicode string :\,

Heh! Handling of Unicode in Python 2.x is annoying. 3.x
is better, but only a little. Off-topic on this list, so I will
shut up now.

Regards,
Gora


Re: Correct to use to store urls (unicode)

2014-01-12 Thread Hakim Benoudjit
I had just forgotten the u'' next to a Unicode string :\,


2014/1/12 Hakim Benoudjit 

> I can add this link using sunburnt (solr python client) so it can not be
> related to solr.
> I think you're right it might be a python issue.
>
> Thanks.
>
>
> 2014/1/12 Gora Mohanty 
>
>> On 12 January 2014 19:45, Hakim Benoudjit  wrote:
>> > hi,
>> >
>> > what's the correct type used to store urls, which can contain some
>> > encoded unicode caracters in the form '%'. Because, the
>> > string type returns an error when I try to store these urls.
>>
>> Please provide more details as that should not be the case.
>>
>> > Btw, I'm using a python client which gives me this error: "'ascii'
>> > codec can't decode byte 0xc3".
>>
>> That is a different issue having to do with Python, and the proper
>> handling of Unicode strings. Try searching Google, or asking on a
>> Python list.
>>
>> Regards,
>> Gora
>>
>
>


Re: Correct to use to store urls (unicode)

2014-01-12 Thread Hakim Benoudjit
I can add this link using sunburnt (the Solr Python client), so it cannot be
related to Solr.
I think you're right, it might be a Python issue.

Thanks.


2014/1/12 Gora Mohanty 

> On 12 January 2014 19:45, Hakim Benoudjit  wrote:
> > hi,
> >
> > what's the correct type used to store urls, which can contain some
> > encoded unicode caracters in the form '%'. Because, the
> > string type returns an error when I try to store these urls.
>
> Please provide more details as that should not be the case.
>
> > Btw, I'm using a python client which gives me this error: "'ascii'
> > codec can't decode byte 0xc3".
>
> That is a different issue having to do with Python, and the proper
> handling of Unicode strings. Try searching Google, or asking on a
> Python list.
>
> Regards,
> Gora
>


Re: Correct to use to store urls (unicode)

2014-01-12 Thread Gora Mohanty
On 12 January 2014 19:45, Hakim Benoudjit  wrote:
> hi,
>
> what's the correct type used to store urls, which can contain some
> encoded unicode caracters in the form '%'. Because, the
> string type returns an error when I try to store these urls.

Please provide more details as that should not be the case.

> Btw, I'm using a python client which gives me this error: "'ascii'
> codec can't decode byte 0xc3".

That is a different issue having to do with Python, and the proper
handling of Unicode strings. Try searching Google, or asking on a
Python list.

Regards,
Gora


Correct to use to store urls (unicode)

2014-01-12 Thread Hakim Benoudjit
Hi,

What is the correct type to use to store URLs, which can contain some
encoded Unicode characters in the form '%'? The
string type returns an error when I try to store these URLs.
Btw, I'm using a Python client which gives me this error: "'ascii'
codec can't decode byte 0xc3".


Re: using extract handler: data not extracted

2014-01-12 Thread Andrea Gazzarini
Please stay on (or clarify) your issue: in the first example you told us
the problem is with the "Coding.pdf" file. What is that Cloud.docx? Why don't
you try with Coding.pdf? And what is the result of the extraction from the
command line with Coding.pdf and the same Tika version that is in your SOLR?

I would use the Tika version shipped with SOLR. Changing that version is
obviously possible, but I don't know which Tika versions are compatible with
your SOLR (because of dependencies) and, last but not least, you didn't tell us
which SOLR version you are using.


On Sun, Jan 12, 2014 at 1:39 PM, sweety  wrote:

> through command line(>java  -jar tika-app-1.4.jar -v C:Cloud.docx) apache
> tika is able to parse .docx files,  so can i use this tika-app-1.4.jar in
> solr?? how to do that??
>
>
>
>


Re: using extract handler: data not extracted

2014-01-12 Thread sweety
Through the command line (> java -jar tika-app-1.4.jar -v C:\Cloud.docx) Apache
Tika is able to parse .docx files, so can I use this tika-app-1.4.jar in
Solr? How do I do that?





Re: using extract handler: data not extracted

2014-01-12 Thread Andrea Gazzarini
A premise: as Erik explained, most probably this issue has nothing to do
with SOLR.
So, these are the options that, in my mind, you have:


*OPTION #1: Using Tika as a command line tool*
a) Download Tika. Make sure it is the same version as the one in your SOLR.
b) Read here: http://tika.apache.org/1.4/gettingstarted.html. There is a
section called "Using Tika as a command line utility"; you can give it your
file, set the verbose flag and see what the output is.


*OPTION #2 (if you are a Java dev)*
a) Create a simple Java project in your workspace and put the Tika libs from
your SOLR bundle on the build path.
b) Read here: http://tika.apache.org/1.4/parser_guide.html. Create and start a
sample main parser (see the sketch after option #3). Here you should have much
deeper control over what happens.


*OPTION #3: Set the log level on SOLR*
As far as I remember, the old (pre-4.x) versions of Solr listed all packages
found in the classloader under the logging tab, so you were able to set the
appropriate level for each of them.
Instead, I'm seeing that on 4.x (at least 4.4.0), after starting SOLR with the
Tika libs, I don't see those packages in the log tree.

You can

a) if you are using Linux, set the FINEST level on the *org.apache* package
and grep the output log (otherwise you will get a lot of messages)
b) directly change the logging.properties under $JETTY_HOME/etc (or your
servlet engine's log configuration files if you are not using Jetty)
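
To make option #2 a bit more concrete, a bare-bones parser main could look
roughly like this (just a sketch, assuming tika-core and tika-parsers from your
SOLR bundle are on the classpath; it takes the file path as its first argument):

import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.sax.BodyContentHandler;

public class TikaDebug {
    public static void main(String[] args) throws Exception {
        AutoDetectParser parser = new AutoDetectParser();
        BodyContentHandler handler = new BodyContentHandler(-1); // -1 = no write limit
        Metadata metadata = new Metadata();
        InputStream stream = new FileInputStream(args[0]);
        try {
            // Parse with auto-detection, roughly what the extract handler does
            parser.parse(stream, handler, metadata, new ParseContext());
        } finally {
            stream.close();
        }
        System.out.println("Detected type: " + metadata.get(Metadata.CONTENT_TYPE));
        System.out.println("Extracted text:");
        System.out.println(handler.toString());
    }
}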

More on (SOLR) logging

http://wiki.apache.org/solr/SolrLogging
http://wiki.apache.org/solr/LoggingInDefaultJettySetup

Best,
Andrea



On Sun, Jan 12, 2014 at 10:13 AM, sweety  wrote:

> ya right all 3 points are right.
> Let me solve the 1 first, there is some errror in tika level indexing, for
> that i need to debug at tika level right??
>  but how to do that?? Solr admin does not show package wise logging.
>
>
>
>


Re: Problem with starting solr

2014-01-12 Thread Rafał Kuć
Hello!

If you want to use Morfologik you need to include its jar files,
because they are not included in Solr's classpath by default. Look at the
downloaded Solr distribution; you should find the
contrib/analysis-extras/lib directory there. Get those libraries, put
them in a directory (e.g. /var/solr/ext) and add a section like this to your
solrconfig.xml (you can find it in the conf directory of Solr):

  <lib dir="/var/solr/ext" />

In addition to that you'll need the
lucene-analyzers-morfologik-4.6.0.jar from
contrib/analysis-extras/lucene-lib. Just put it in the same directory.

After that restart Solr and you should be OK.

Btw - you can find the instructions (although a bit old) at:
http://solr.pl/2012/04/02/solr-4-0-i-mozliwosci-analizy-jezyka-polskiego/

-- 
Regards,
 Rafał Kuć
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


> Hi.
> I'm total beginner to Solr. 
> When I'm trying to start it I get following errors:

> [...]

Problem with starting solr

2014-01-12 Thread lukaszzenko
Hi,
I'm a total beginner to Solr.
When I try to start it I get the following errors:

lukasz@lukasz-VirtualBox:~/PycharmProjects/solr/solr-4.6.0/example$ java
-jar start.jar 
0[main] INFO  org.eclipse.jetty.server.Server  – jetty-8.1.10.v20130312
104  [main] INFO  org.eclipse.jetty.deploy.providers.ScanningAppProvider  –
Deployment monitor
/home/lukasz/PycharmProjects/solr/solr-4.6.0/example/contexts at interval 0
214  [main] INFO  org.eclipse.jetty.deploy.DeploymentManager  – Deployable
added:
/home/lukasz/PycharmProjects/solr/solr-4.6.0/example/contexts/solr-jetty-context.xml
9566 [main] INFO  org.eclipse.jetty.webapp.StandardDescriptorProcessor  – NO
JSP Support for /solr, did not find org.apache.jasper.servlet.JspServlet
9746 [main] INFO  org.apache.solr.servlet.SolrDispatchFilter  –
SolrDispatchFilter.init()
9805 [main] INFO  org.apache.solr.core.SolrResourceLoader  – JNDI not
configured for solr (NoInitialContextEx)
9808 [main] INFO  org.apache.solr.core.SolrResourceLoader  – solr home
defaulted to 'solr/' (could not find system property or JNDI)
9809 [main] INFO  org.apache.solr.core.SolrResourceLoader  – new
SolrResourceLoader for directory: 'solr/'
10116 [main] INFO  org.apache.solr.core.ConfigSolr  – Loading container
configuration from
/home/lukasz/PycharmProjects/solr/solr-4.6.0/example/solr/solr.xml
10406 [main] INFO  org.apache.solr.core.CoreContainer  – New CoreContainer
30060911
10407 [main] INFO  org.apache.solr.core.CoreContainer  – Loading cores into
CoreContainer [instanceDir=solr/]
10479 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory 
– Setting socketTimeout to: 0
10479 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory 
– Setting urlScheme to: http://
10480 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory 
– Setting connTimeout to: 0
10481 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory 
– Setting maxConnectionsPerHost to: 20
10483 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory 
– Setting corePoolSize to: 0
10485 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory 
– Setting maximumPoolSize to: 2147483647
10486 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory 
– Setting maxThreadIdleTime to: 5
10487 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory 
– Setting sizeOfQueue to: -1
10488 [main] INFO  org.apache.solr.handler.component.HttpShardHandlerFactory 
– Setting fairnessPolicy to: false
10702 [main] INFO  org.apache.solr.logging.LogWatcher  – SLF4J impl is
org.slf4j.impl.Log4jLoggerFactory
10704 [main] INFO  org.apache.solr.logging.LogWatcher  – Registering Log
Listener [Log4j (org.slf4j.impl.Log4jLoggerFactory)]
10839 [coreLoadExecutor-3-thread-1] INFO  org.apache.solr.core.CoreContainer 
– Creating SolrCore 'content' using instanceDir: solr/content
10839 [coreLoadExecutor-3-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader  – new SolrResourceLoader for
directory: 'solr/content/'
10850 [coreLoadExecutor-3-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader  – Adding
'file:/home/lukasz/PycharmProjects/solr/solr-4.6.0/example/solr/content/lib/lucene-analyzers-morfologik-4.6.0.jar'
to classloader
10855 [coreLoadExecutor-3-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader  – Adding
'file:/home/lukasz/PycharmProjects/solr/solr-4.6.0/example/solr/content/lib/morfologik-fsa-1.8.1.jar'
to classloader
10857 [coreLoadExecutor-3-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader  – Adding
'file:/home/lukasz/PycharmProjects/solr/solr-4.6.0/example/solr/content/lib/morfologik-polish-1.8.1.jar'
to classloader
10859 [coreLoadExecutor-3-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader  – Adding
'file:/home/lukasz/PycharmProjects/solr/solr-4.6.0/example/solr/content/lib/morfologik-stemming-1.8.1.jar'
to classloader
11005 [coreLoadExecutor-3-thread-1] INFO  org.apache.solr.core.SolrConfig  –
Adding specified lib dirs to ClassLoader
11022 [coreLoadExecutor-3-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader  – Adding
'file:/home/lukasz/PycharmProjects/solr/solr-4.6.0/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar'
to classloader
11025 [coreLoadExecutor-3-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader  – Adding
'file:/home/lukasz/PycharmProjects/solr/solr-4.6.0/contrib/extraction/lib/apache-mime4j-dom-0.7.2.jar'
to classloader
11027 [coreLoadExecutor-3-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader  – Adding
'file:/home/lukasz/PycharmProjects/solr/solr-4.6.0/contrib/extraction/lib/bcmail-jdk15-1.45.jar'
to classloader
11030 [coreLoadExecutor-3-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader  – Adding
'file:/home/lukasz/PycharmProjects/solr/solr-4.6.0/contrib/extraction/lib/bcprov-jdk15-1.45.jar'
to classloader
11044 [coreLoadExecutor-3-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader  – Adding
'file:/home/lukasz/PycharmProjects/solr/solr-4.6.0/contrib/ext

Re: using extract handler: data not extracted

2014-01-12 Thread sweety
Ya, right, all 3 points are correct.
Let me solve the first one first: there is some error in Tika-level indexing, and
for that I need to debug at the Tika level, right?
But how do I do that? The Solr admin does not show package-wise logging.





Re: using extract handler: data not extracted

2014-01-12 Thread Andrea Gazzarini
Wait, don't confuse things... these should be three different issues:

1. With curl, indexing happens but leaves the content field empty, so
probably something occurs at the Tika level during the text extraction. That's
the reason why I told you about the Tika logging.

2. With solrj, indexing doesn't happen because of a syntax error. Probably there
is some usage mistake (see the sketch after point 3).

3. I don't know solrnet, but a linkage error is something at a low level
(related to libraries and classes), so I assume that here, too, indexing doesn't
happen. If you don't know about those "internalities" I would suggest
concentrating on 1) and 2).
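
Regarding point 2, typical solrj usage of the extract handler looks roughly like
the sketch below (the URL, file path and literal.id value are only examples):
the file goes in as a content stream, not into the query.

import java.io.File;

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

public class ExtractExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8080/solr/documents");
        ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
        // The document body travels as a content stream ...
        req.addFile(new File("C:\\solr\\document\\src\\new_index_doc\\document_1.doc"),
                "application/msword");
        // ... and the unique key is passed as a literal, not as a query.
        req.setParam("literal.id", "document_1");
        req.setParam("commit", "true");
        server.request(req);
        server.shutdown();
    }
}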

Best,
Andrea
On 12 Jan 2014 08:21, "sweety"  wrote:

> this is the output i get when indexed through* solrj*, i followed the link
> you suggested.
> i tried indexing .doc file.
> 
> 
> status: 400, QTime: 17
> error: org.apache.solr.search.SyntaxError: Cannot parse
> 'id:C:\solr\document\src\new_index_doc\document_1.doc': Encountered " ":" ":
> "" at line 1, column 4. Was expecting one of: ... "+" ... "-" ... "(" ...
> "*" ... "^" ... "[" ... "{" ... (the other expected token names were stripped
> from the archived message)
> code: 400
> 
> 
>
> Also when indexed with *solrnet*, i get this error:
> Caused by: java.lang.LinkageError: loader constraint violation: loader
> (instance of org/apache/catalina/loader/WebappClassLoader) previously
> initiated loading for a different type with name
> "org/apache/xmlbeans/XmlCursor"
> why this linkage error??
>
> Now *curl does not work, neither does solrj and solrnet.*
>
>
>
>