RE: Preparing Solr 4.2.1 for IntelliJ fails - invalid sha1

2013-04-27 Thread Shahar Davidson
Hi Steve,

Your help is much appreciated.

Turns out that all the problems that I had were proxy related. I had to 
explicitly provide the proxy configuration (host/port) to Ant. (though I 
already have been using ivy-2.3.0,  IVY-1194 was a good tip!)

That solved everything.

Thanks again,

Shahar.

-Original Message-
From: Steve Rowe [mailto:sar...@gmail.com] 
Sent: Thursday, April 25, 2013 4:50 PM
To: solr-user@lucene.apache.org
Subject: Re: Preparing Solr 4.2.1 for IntelliJ fails - invalid sha1

Hi Shahar,

I suspect you may have an older version of Ivy installed - the errors you're 
seeing look like IVY-1194 , 
which was fixed in Ivy 2.2.0.  Lucene/Solr uses Ivy 2.3.0.  Take a look in 
C:\Users\account\.ant\lib\ and remove older versions of ivy-*.jar, then run 
'ant ivy-bootstrap' from the Solr source code to download ivy-2.3.0.jar to 
C:\Users\account\.ant\lib\.

Just now on a Windows 7 box, I downloaded solr-4.2.1-src.tgz from one of the 
Apache mirrors, unpacked it, deleted my C:\Users\account\.ivy2\ directory (so 
that ivy would re-download everything), and ran 'ant idea' from a cmd window.  
BUILD SUCCESSFUL.

Steve

On Apr 25, 2013, at 6:07 AM, Shahar Davidson  wrote:

> Hi all,
> 
> I'm trying to run 'ant idea' on 4.2.* and I'm getting "invalid sha1" 
> error messages. (see below)
> 
> I'll appreciate any help,
> 
> Shahar
> ===
> .
> .
> .
> resolve
> ivy:retrieve
> 
> :: problems summary ::
>  WARNINGS
>   problem while downloading module descriptor: 
> http://repo1.maven.org/maven2/org/apache/ant/ant/1.8.2/ant-1.8.2.pom: invalid 
> sha1: expected= computed=3e839ffb83951c79858075ddd4587bf67612b3c4 (53ms)
>   problem while downloading module descriptor: 
> http://maven.restlet.org/org/apache/ant/ant/1.8.2/ant-1.8.2.pom: invalid 
> sha1: expected= computed=3e839ffb83951c79858075ddd4587bf67612b3c4 (58ms)
> 
>   module not found: org.apache.ant#ant;1.8.2 .
> .
> .
>    public: tried
> http://repo1.maven.org/maven2/org/apache/ant/ant/1.8.2/ant-1.8.2.pom
>    sonatype-releases: tried
> 
> http://oss.sonatype.org/content/repositories/releases/org/apache/ant/ant/1.8.2/ant-1.8.2.pom
>    maven.restlet.org: tried
> http://maven.restlet.org/org/apache/ant/ant/1.8.2/ant-1.8.2.pom
>    working-chinese-mirror: tried
> 
> http://mirror.netcologne.de/maven2/org/apache/ant/ant/1.8.2/ant-1.8.2.pom
>   problem while downloading module descriptor: 
> http://repo1.maven.org/maven2/junit/junit/4.10/junit-4.10.pom: invalid sha1: 
> expected= computed=3e839ffb83951c79858075ddd4587bf67612b3c4 (60ms)
>   problem while downloading module descriptor: 
> http://maven.restlet.org/junit/junit/4.10/junit-4.10.pom: invalid sha1: 
> expected= computed=3e839ffb83951c79858075ddd4587bf67612b3c4 (60ms)
> 
>   module not found: junit#junit;4.10
> .
> .
> .
> .
>::
>   ::  UNRESOLVED DEPENDENCIES ::
>   ::
>   :: org.apache.ant#ant;1.8.2: not found
>   :: junit#junit;4.10: not found
>   :: com.carrotsearch.randomizedtesting#junit4-ant;2.0.8: not 
> found
>   :: 
> com.carrotsearch.randomizedtesting#randomizedtesting-runner;2.0.8: not found
>   ::
> 
> :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
> D:\apache_solr_4.2.1\lucene\common-build.xml:348: impossible to resolve 
> dependencies:
>   resolve failed - see output for details


Email secured by Check Point


IllegalArgumentException: port out of range:-1 When Running Embedded Zookeeper Ensemble

2013-04-27 Thread Furkan KAMACI
I am trying to run an embedded Zookeeper ensemble however I get that error:

Apr 28, 2013 1:02:48 AM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.IllegalArgumentException: port out of range:-1
at java.net.InetSocketAddress.(InetSocketAddress.java:101)
at java.net.InetSocketAddress.(InetSocketAddress.java:81)
at
org.apache.solr.cloud.SolrZkServerProps.setClientPort(SolrZkServer.java:315)

at
org.apache.solr.cloud.SolrZkServerProps.getMyServerId(SolrZkServer.java:278)

at
org.apache.solr.cloud.SolrZkServerProps.parseProperties(SolrZkServer.java:453)

at org.apache.solr.cloud.SolrZkServer.parseConfig(SolrZkServer.java:90)
at org.apache.solr.core.CoreContainer.initZooKeeper(CoreContainer.java:228)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:520)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:405)
at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:337)

at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:110)

at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:119)
at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)

at
org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:724)

at
org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:265)

at
org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1250)

at
org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:706)

at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:492)
at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)

at
org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:39)

at org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:186)
at
org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:494)

at
org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:141)

at
org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:145)

at
org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:56)

at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:609)
at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:540)
at org.eclipse.jetty.util.Scanner.scan(Scanner.java:403)
at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:337)
at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)

at
org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:121)

at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)

at
org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:555)

at
org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:230)

at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)

at
org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:81)

at
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:58)

at
org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:96)

at org.eclipse.jetty.server.Server.doStart(Server.java:277)
at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)

at org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1266)
at java.security.AccessController.doPrivileged(Native Method)
at org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1189)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.eclipse.jetty.start.Main.invokeMain(Main.java:472)
at org.eclipse.jetty.start.Main.start(Main.java:620)
at org.eclipse.jetty.start.Main.main(Main.java:95)

I used that:

-DzkRun -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf
-Dcollection.configName=myconf -DzkHost=nem:9983,rem:9983,kem:9983

I read that thread:
http://lucene.472066.n3.nabble.com/solr4-0-problem-zkHost-with-multiple-hosts-throws-out-of-range-exception-td4014440.htmlI
use hostnames and when I ping I get respones to that hostnames.
However
it doesn't work. I use Solr 4.2.1.
Any ideas?


Re: Solr Indexing Rich Documents

2013-04-27 Thread Ahmet Arslan
Here is the documentation page : 
http://manifoldcf.apache.org/release/trunk/en_US/end-user-documentation.html#filesystemrepository


- Original Message -
From: Furkan KAMACI 
To: solr-user@lucene.apache.org; Ahmet Arslan 
Cc: 
Sent: Saturday, April 27, 2013 2:48 PM
Subject: Re: Solr Indexing Rich Documents

Yes, file system

2013/4/27 Ahmet Arslan 

> hi,
>
> Where do you store your rich documents? File system?
>
>
>
>
> - Original Message -
> From: Furkan KAMACI 
> To: solr-user@lucene.apache.org
> Cc:
> Sent: Friday, April 26, 2013 6:19 PM
> Subject: Re: Solr Indexing Rich Documents
>
> Is there any example at wiki for Manifold?
>
> 2013/4/26 Ahmet Arslan 
>
> > Hi Furkan,
> >
> > post.jar meant to be used as example, quick start etc. For production
> > (incremental updates, deletes) consider using
> http://manifoldcf.apache.orgfor indexing rich documents. It utilises
> ExtractingRequestHandler feature
> > of solr.
> >
> > --- On Fri, 4/26/13, Furkan KAMACI  wrote:
> >
> > > From: Furkan KAMACI 
> > > Subject: Re: Solr Indexing Rich Documents
> > > To: solr-user@lucene.apache.org
> > > Date: Friday, April 26, 2013, 3:39 PM
> > > Thanks for the answer, I get an error
> > > now: FileNotFound Exception as I
> > > mentioned at other thread. Now I' trying to solve it.
> > >
> > > 2013/4/26 Jack Krupansky 
> > >
> > > > It's called SolrCell or the ExtractingRequestHandler
> > > (/update/extract),
> > > > which the newer post.jar knows to use for some file
> > > types:
> > > > http://wiki.apache.org/solr/ExtractingRequestHandler
> > > >
> > > > -- Jack Krupansky
> > > >
> > > > -Original Message- From: Furkan KAMACI
> > > > Sent: Friday, April 26, 2013 4:48 AM
> > > > To: solr-user@lucene.apache.org
> > > > Subject: Solr Indexing Rich Documents
> > > >
> > > >
> > > > I have a large corpus of rich documents i.e. pdf and
> > > doc files. I think
> > > > that I can use directly the example jar of Solr.
> > > However for a real time
> > > > environment what should I care? Also how do you send
> > > such kind of documents
> > > > into Solr to index, I think post.jar does not handle
> > > that file type?  I
> > > > should mention that I don't store documents in a
> > > database.
> > > >
> > >
> >
>
>



Re: Indexing DB

2013-04-27 Thread Gora Mohanty
On 27 April 2013 22:50, Peri Subrahmanya  wrote:
> I hooked up the dataimorthandler to my db which has around 15M records.
> When I run the full-imoport, its erroring out with java heap space
> message. Is there something I need to configure?

(a) Please do not hijack threads, but start your own. Here is
 why this is bad:
 http://people.apache.org/~hossman/#threadhijack
(b) People on mailing lists do not have access to what you are
 trying, and the issues that you are encountering. Please take
 the trouble to provide details. This might be of help:
 http://wiki.apache.org/solr/UsingMailingLists

Regards,
Gora


Re: Indexing DB

2013-04-27 Thread Peri Subrahmanya
I fixed it by setting the batchSize="-1" in the db-config.xml. Apparently
the default fetch size of 500 rows is not honored by some of the db
drivers. I was using MySQL server.

Thank you,
Peri Subrahmanya




On 4/27/13 1:20 PM, "Peri Subrahmanya"  wrote:

>I hooked up the dataimorthandler to my db which has around 15M records.
>When I run the full-imoport, its erroring out with java heap space
>message. Is there something I need to configure?
>
>Thank you,
>Peri Subrahmanya
>




*** DISCLAIMER *** This is a PRIVATE message. If you are not the intended 
recipient, please delete without copying and kindly advise us by e-mail of the 
mistake in delivery.
NOTE: Regardless of content, this e-mail shall not operate to bind HTC Global 
Services to any order or other contract unless pursuant to explicit written 
agreement or government initiative expressly permitting the use of e-mail for 
such purpose.




Content Field Incudes Some Other Fields By Tika

2013-04-27 Thread Furkan KAMACI
I have that fields at my schema.xml:






and my solrconfig:



true
content




however when I search some pdf files with Solr content starts with:

stream_content_type text/plain stream_size 959 Content-Encoding ISO-8859-1

and after that real content of file comes. Why I see them? I do not copy
and field into content field and I think I ignore any other fields that is
not defined in schema. How to remove them?


Indexing DB

2013-04-27 Thread Peri Subrahmanya
I hooked up the dataimorthandler to my db which has around 15M records.
When I run the full-imoport, its erroring out with java heap space
message. Is there something I need to configure?

Thank you,
Peri Subrahmanya



On 4/27/13 3:53 AM, "Mohsen Saboorian"  wrote:

>I have a Solr 4.2 server setup on a CentOS 6.4 x86 and Java:
>OpenJDK Runtime Environment (rhel-2.3.8.0.el6_4-i386)
>OpenJDK Server VM (build 23.7-b01, mixed mode)
>
>Currently it has a 4.6GB index with ~400k records. When I search for
>certain keywords, Solr fails with the following message.
>Any idea why this happens? Can I repair indices?
>
>1. I didn't specify any SolrDeletionPolicy. It's commented in
>solrconfig.xml as default.
>2.  is now LUCENE_42 (but it was LUCNE_41 before I
>upgrade to solr 4.2)
>
>Thanks,
>Mohsen
>
>SEVERE: null:java.io.IOException: Input/output error:
>NIOFSIndexInput(path="/app/solr/tomcat/solr/core1/data/index/_2xmx.tvd")
>at
>org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSD
>irectory.java:191)
>at
>org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:
>272)
>at
>org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.jav
>a:51)
>at
>org.apache.lucene.util.packed.BlockPackedReaderIterator.skip(BlockPackedRe
>aderIterator.java:127)
>at
>org.apache.lucene.codecs.compressing.CompressingTermVectorsReader.readPosi
>tions(CompressingTermVectorsReader.java:586)
>at
>org.apache.lucene.codecs.compressing.CompressingTermVectorsReader.get(Comp
>ressingTermVectorsReader.java:381)
>at
>org.apache.lucene.index.SegmentReader.getTermVectors(SegmentReader.java:17
>5)
>at
>org.apache.lucene.index.BaseCompositeReader.getTermVectors(BaseCompositeRe
>ader.java:97)
>at
>org.apache.lucene.search.highlight.TokenSources.getTokenStreamWithOffsets(
>TokenSources.java:280)
>at
>org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingByHighlight
>er(DefaultSolrHighlighter.java:453)
>at
>org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSol
>rHighlighter.java:391)
>at
>org.apache.solr.handler.component.HighlightComponent.process(HighlightComp
>onent.java:139)
>at
>org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHa
>ndler.java:208)
>at
>org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBas
>e.java:135)
>at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
>at
>org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java
>:639)
>at
>org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.jav
>a:345)
>at
>org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.jav
>a:141)
>at
>org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Applicati
>onFilterChain.java:243)
>at
>org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilter
>Chain.java:210)
>at
>org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.
>java:222)
>at
>org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.
>java:123)
>at
>org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:1
>71)
>at
>org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:9
>9)
>at
>org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
>at
>org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.ja
>va:118)
>at
>org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408
>)
>at
>org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Pro
>cessor.java:1009)
>at
>org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(Abstr
>actProtocol.java:589)
>at
>org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.jav
>a:310)
>at
>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
>1145)
>at
>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java
>:615)
>at java.lang.Thread.run(Thread.java:722)
>Caused by: java.io.IOException: Input/output error
>at sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
>at sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:51)
>at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:222)
>at sun.nio.ch.IOUtil.read(IOUtil.java:198)
>at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:674)
>at
>org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSD
>irectory.java:176)
>... 32 more
>
>




*** DISCLAIMER *** This is a PRIVATE message. If you are not the intended 
recipient, please delete without copying and kindly advise us by e-mail of the 
mistake in delivery.
NOTE: Regardless of content, this e-mail shall not operate to bin

Re: SEVERE: null:java.io.IOException: Input/output error: NIOFSIndexInput

2013-04-27 Thread Jack Krupansky
Sounds like your index may be corrupted. Run checkindex. See similar 
discussion at:


http://lucene.472066.n3.nabble.com/index-merge-scheduler-exception-java-io-IOException-Input-output-error-td3993774.html

-- Jack Krupansky

-Original Message- 
From: Mohsen Saboorian

Sent: Saturday, April 27, 2013 3:33 AM
To: solr-user@lucene.apache.org
Subject: SEVERE: null:java.io.IOException: Input/output error: 
NIOFSIndexInput


I have a Solr 4.2 (with Lucene 4.1 index version) server setup on a CentOS
6.4 x86 and Java:
OpenJDK Runtime Environment (rhel-2.3.8.0.el6_4-i386)
OpenJDK Server VM (build 23.7-b01, mixed mode)

Currently it has a 4.6GB index with ~400k records. When I serach for certain
keywords, Solr fails with the following message:

Any idea why this happens? Can I repair indices?

Thanks.
Mohsen

SEVERE: null:java.io.IOException: Input/output error:
NIOFSIndexInput(path="/app/solr/tomcat/solr/core1/data/index/_2xmx.tvd")
   at
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:191)
   at
org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:272)
   at
org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:51)
   at
org.apache.lucene.util.packed.BlockPackedReaderIterator.skip(BlockPackedReaderIterator.java:127)
   at
org.apache.lucene.codecs.compressing.CompressingTermVectorsReader.readPositions(CompressingTermVectorsReader.java:586)
   at
org.apache.lucene.codecs.compressing.CompressingTermVectorsReader.get(CompressingTermVectorsReader.java:381)
   at
org.apache.lucene.index.SegmentReader.getTermVectors(SegmentReader.java:175)
   at
org.apache.lucene.index.BaseCompositeReader.getTermVectors(BaseCompositeReader.java:97)
   at
org.apache.lucene.search.highlight.TokenSources.getTokenStreamWithOffsets(TokenSources.java:280)
   at
org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingByHighlighter(DefaultSolrHighlighter.java:453)
   at
org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:391)
   at
org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:139)
   at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
   at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
   at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
   at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
   at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
   at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
   at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
   at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
   at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
   at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
   at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
   at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
   at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
   at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
   at
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
   at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
   at
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
   at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: Input/output error
   at sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
   at sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:51)
   at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:222)
   at sun.nio.ch.IOUtil.read(IOUtil.java:198)
   at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:674)
   at
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:176)
   ... 32 more




--
View this message in context: 
http://lucene.472066.n3.nabble.com/SEVERE-null-java-io-IOException-Input-output-error-NIOFSIndexInput-tp4059478.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Exclude Pattern at Dynamic Field

2013-04-27 Thread Furkan KAMACI
Yes, you are right. I used uprefix and solved what I want.

2013/4/26 Jack Krupansky 

> No, other than to be explicit about individual patterns, which is better
> anyway.
>
> Generally, "*" is a crutch or experimental tool ("Let's just see what all
> the data and metadata is and then decide what to keep"). It is better to
> use explicit patterns or static schema for production use.
>
> -- Jack Krupansky
>
> -Original Message- From: Furkan KAMACI
> Sent: Friday, April 26, 2013 11:29 AM
> To: solr-user@lucene.apache.org
> Subject: Exclude Pattern at Dynamic Field
>
>
> I use that at my Solr 4.2.1:
>
> 
>
> however can I exlude some patterns from it?
>


Re: excluding something from copyfield source?

2013-04-27 Thread Furkan KAMACI
Thanks, I used uprefix and solved my problem.

2013/4/26 Gora Mohanty 

> On 26 April 2013 20:51, Furkan KAMACI  wrote:
> > Hi;
> >
> > I use that:
> >
> > 
> >
> > however I want to exclude something i.e. author field. How can I do that?
>
> Instead of using *, use separate copyField directives for the
> fields that you want copied. You can use more restrictive globs
> also, e.g.,
> 
>
> Regards,
> Gora
>


Re: uniqueKey required false for multivalued id when indexing rich documents

2013-04-27 Thread Furkan KAMACI
Ok, thanks for the answer.

2013/4/26 Gora Mohanty 

> On 26 April 2013 18:38, Furkan KAMACI  wrote:
> > I am new to Solr and try to index rich files. I have defined that at my
> > schema:
> [...]
> > 
>
> This will not work: Please see http://wiki.apache.org/solr/UniqueKey
> for different use cases for the uniqueKey.
>
> For documents, I usually use the document name, or some segment
> of the filesystem path as the uniqueKey as that is automatically
> guaranteed to be unique.
>
> Regards,
> Gora
>


Re: Document is missing mandatory uniqueKey field: id for Solr PDF indexing

2013-04-27 Thread Furkan KAMACI
OK, I asked another question for it and it has solved, thanks.

2013/4/26 Furkan KAMACI 

> Jack, thanks for your answers. Ok, when I remove -Durl parameter I think
> it works, thanks. However I think that I have a problem with my schema. I
> get that error:
>
> Apr 26, 2013 3:52:21 PM org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: ERROR:
> [doc=/home/ll/Desktop/b/lucene-solr-lucene_solr_4_2_1/solr/example/exampledocs/523387.pdf]
> multiple values encountered for non multiValued copy field text:
> application/pdf
>
>
> 2013/4/26 Jack Krupansky 
>
>> Maybe you are confusing things by mixing instructions - there are
>> SEPARATE instructions for directly using SolrCell and implicitly using it
>> via post.jar. Pick which you want and stick with it. DO NOT MIX the
>> instructions.
>>
>> You wrote: " I run that command: java -Durl=
>> http://localhost:8983/solr/update/extract -jar post.jar 523387.pdf"
>>
>> Was there a GOOD reason that you chose that URL?
>>
>> Best to stay with what the post.jar wiki recommends:
>>
>> Post all CSV, XML, JSON and PDF documents using AUTO mode which detects
>> type based on file name:
>>
>> java -Dauto -jar post.jar *.csv *.xml *.json *.pdf
>>
>> Or, stick with SolrCell directly, but follow its distinct instructions:
>> http://wiki.apache.org/solr/ExtractingRequestHandler
>>
>> Again, DO NOT MIX the instructions from the two.
>>
>> post.jar is designed so that you do not need to know or care exactly how
>> rich document indexing works.
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Furkan KAMACI
>> Sent: Friday, April 26, 2013 5:30 AM
>> To: solr-user@lucene.apache.org
>> Subject: Document is missing mandatory uniqueKey field: id for Solr PDF
>> indexing
>>
>>
>> I use Solr 4.2.1 and these are my fields:
>>
>> > required="true"
>> multiValued="false" />
>> 
>>
>>
>> 
>> > multiValued="true"/>
>> 
>> > stored="true"/>
>> 
>> 
>> 
>> 
>> > stored="true"/>
>> 
>> > multiValued="true"/>
>> 
>> > multiValued="true"/>
>>
>> 
>> > multiValued="true"/>
>>
>>
>> 
>> 
>> 
>> > stored="false" multiValued="true"/>
>>
>> 
>> 
>>
>> 
>>
>> 
>>
>> I run that command:
>>
>> java -Durl=http://localhost:8983/solr/update/extract -jar post.jar
>> 523387.pdf
>>
>> However I get that error, any ideas?
>>
>> Apr 26, 2013 12:26:51 PM org.apache.solr.common.SolrException log
>> SEVERE: org.apache.solr.common.SolrException: Document is missing
>> mandatory
>> uniqueKey field: id
>> at
>>
>> org.apache.solr.update.AddUpdateCommand.getIndexedId(AddUpdateCommand.java:88)
>> at
>>
>> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:464)
>> at
>>
>> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:346)
>> at
>>
>> org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
>> at
>>
>> org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:121)
>> at
>>
>> org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:126)
>> at
>>
>> org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
>> at
>>
>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>> at
>>
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
>> at
>>
>> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
>> at
>>
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
>> at
>>
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
>> at
>>
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
>> at
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
>> at
>>
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
>> at
>>
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
>> at
>>
>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
>> at
>>
>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
>> at
>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
>> at
>>
>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
>> at
>>
>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
>> at
>>
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
>> at
>>
>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
>> at
>>
>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
>> at
>>
>> org.eclipse.jetty.server.handler.HandlerWrapper.han

Re: Prons an Cons of Startup Lazy a Handler?

2013-04-27 Thread Furkan KAMACI
Thanks for the answers.

2013/4/26 Chris Hostetter 

>
> : In short, whether you want to keep the handler is completely independent
> of
> : the lazy startup option.
>
> I think Jack missread your question -- my interpretation is that you are
> asking about the pros/cons of removing 'startup="lazy"' ...
>
> :  : startup="lazy"class="solr.extraction.ExtractingRequestHandler" >
> :
> : it startups it lazy. So what is pros and cons for removing it for my
> : situation?
>
> ...if you know you will definitely be using this handler, then you should
> probably remove startup="lazy" -- the advantages of using lazy request
> handlers is that there is no "init cost" for having them in your config if
> you never use them, making it handy for the example configs that many
> people copy and re-use w/o modifying so that they don't pay any price in
> having features declared that htey don't use.
>
>
> -Hoss
>


Re: Solr Indexing Rich Documents

2013-04-27 Thread Furkan KAMACI
Yes, file system

2013/4/27 Ahmet Arslan 

> hi,
>
> Where do you store your rich documents? File system?
>
>
>
>
> - Original Message -
> From: Furkan KAMACI 
> To: solr-user@lucene.apache.org
> Cc:
> Sent: Friday, April 26, 2013 6:19 PM
> Subject: Re: Solr Indexing Rich Documents
>
> Is there any example at wiki for Manifold?
>
> 2013/4/26 Ahmet Arslan 
>
> > Hi Furkan,
> >
> > post.jar meant to be used as example, quick start etc. For production
> > (incremental updates, deletes) consider using
> http://manifoldcf.apache.orgfor indexing rich documents. It utilises
> ExtractingRequestHandler feature
> > of solr.
> >
> > --- On Fri, 4/26/13, Furkan KAMACI  wrote:
> >
> > > From: Furkan KAMACI 
> > > Subject: Re: Solr Indexing Rich Documents
> > > To: solr-user@lucene.apache.org
> > > Date: Friday, April 26, 2013, 3:39 PM
> > > Thanks for the answer, I get an error
> > > now: FileNotFound Exception as I
> > > mentioned at other thread. Now I' trying to solve it.
> > >
> > > 2013/4/26 Jack Krupansky 
> > >
> > > > It's called SolrCell or the ExtractingRequestHandler
> > > (/update/extract),
> > > > which the newer post.jar knows to use for some file
> > > types:
> > > > http://wiki.apache.org/solr/ExtractingRequestHandler
> > > >
> > > > -- Jack Krupansky
> > > >
> > > > -Original Message- From: Furkan KAMACI
> > > > Sent: Friday, April 26, 2013 4:48 AM
> > > > To: solr-user@lucene.apache.org
> > > > Subject: Solr Indexing Rich Documents
> > > >
> > > >
> > > > I have a large corpus of rich documents i.e. pdf and
> > > doc files. I think
> > > > that I can use directly the example jar of Solr.
> > > However for a real time
> > > > environment what should I care? Also how do you send
> > > such kind of documents
> > > > into Solr to index, I think post.jar does not handle
> > > that file type?  I
> > > > should mention that I don't store documents in a
> > > database.
> > > >
> > >
> >
>
>


Re: Solr Indexing Rich Documents

2013-04-27 Thread Ahmet Arslan
hi,

Where do you store your rich documents? File system?




- Original Message -
From: Furkan KAMACI 
To: solr-user@lucene.apache.org
Cc: 
Sent: Friday, April 26, 2013 6:19 PM
Subject: Re: Solr Indexing Rich Documents

Is there any example at wiki for Manifold?

2013/4/26 Ahmet Arslan 

> Hi Furkan,
>
> post.jar is meant to be used as an example / quick-start tool. For production
> (incremental updates, deletes) consider using http://manifoldcf.apache.org for
> indexing rich documents. It utilises the ExtractingRequestHandler feature
> of Solr.
>
> --- On Fri, 4/26/13, Furkan KAMACI  wrote:
>
> > From: Furkan KAMACI 
> > Subject: Re: Solr Indexing Rich Documents
> > To: solr-user@lucene.apache.org
> > Date: Friday, April 26, 2013, 3:39 PM
> > Thanks for the answer. I now get a
> > FileNotFoundException, as I
> > mentioned in the other thread. I'm now trying to solve it.
> >
> > 2013/4/26 Jack Krupansky 
> >
> > > It's called SolrCell or the ExtractingRequestHandler
> > > (/update/extract), which the newer post.jar knows to use for some
> > > file types:
> > > http://wiki.apache.org/solr/ExtractingRequestHandler
> > >
> > > -- Jack Krupansky
> > >
> > > -Original Message- From: Furkan KAMACI
> > > Sent: Friday, April 26, 2013 4:48 AM
> > > To: solr-user@lucene.apache.org
> > > Subject: Solr Indexing Rich Documents
> > >
> > >
> > > I have a large corpus of rich documents, i.e. PDF and DOC files. I
> > > think I can directly use Solr's example jar. However, what should I
> > > take care of in a real-time environment? Also, how do you send such
> > > documents to Solr for indexing? I think post.jar does not handle
> > > those file types. I should mention that I don't store the documents
> > > in a database.
> > >
> >
>
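
For reference, the SolrCell approach discussed above (posting a rich document to /update/extract) can be sketched in a few lines. This is a minimal, hypothetical example assuming a Solr 4.x instance at localhost:8983 with the default example core; the document id and file name are placeholders, not anything from the thread.

```python
# Minimal sketch of indexing a rich document (e.g. a PDF) through Solr's
# ExtractingRequestHandler (SolrCell). Assumes Solr 4.x at localhost:8983
# with /update/extract enabled; the id and file name are placeholders.
import urllib.parse

def extract_url(base="http://localhost:8983/solr", doc_id="doc1"):
    """Build the /update/extract URL: literal.id sets the unique key,
    commit=true makes the document searchable immediately."""
    params = urllib.parse.urlencode({"literal.id": doc_id, "commit": "true"})
    return "%s/update/extract?%s" % (base, params)

# Posting the file itself would then look like (with Solr running):
#   curl "<url above>" -F "myfile=@document.pdf"
print(extract_url())
```

As Ahmet notes, this is fine for experiments; for production crawling with incremental updates and deletes, ManifoldCF drives the same handler for you.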



Search for certain keywords fail: java.io.IOException: Input/output error: NIOFSIndexInput

2013-04-27 Thread Mohsen Saboorian
I have a Solr 4.2 server set up on CentOS 6.4 (x86) with Java:
OpenJDK Runtime Environment (rhel-2.3.8.0.el6_4-i386)
OpenJDK Server VM (build 23.7-b01, mixed mode)

Currently it has a 4.6GB index with ~400k records. When I search for
certain keywords, Solr fails with the message below.
Any idea why this happens? Can I repair the index?

1. I didn't specify any SolrDeletionPolicy; it's commented out in
solrconfig.xml by default.
2. luceneMatchVersion is now LUCENE_42 (but it was LUCENE_41 before I
upgraded to Solr 4.2)

Thanks,
Mohsen

SEVERE: null:java.io.IOException: Input/output error:
NIOFSIndexInput(path="/app/solr/tomcat/solr/core1/data/index/_2xmx.tvd")
at
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:191)
at
org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:272)
at
org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:51)
at
org.apache.lucene.util.packed.BlockPackedReaderIterator.skip(BlockPackedReaderIterator.java:127)
at
org.apache.lucene.codecs.compressing.CompressingTermVectorsReader.readPositions(CompressingTermVectorsReader.java:586)
at
org.apache.lucene.codecs.compressing.CompressingTermVectorsReader.get(CompressingTermVectorsReader.java:381)
at
org.apache.lucene.index.SegmentReader.getTermVectors(SegmentReader.java:175)
at
org.apache.lucene.index.BaseCompositeReader.getTermVectors(BaseCompositeReader.java:97)
at
org.apache.lucene.search.highlight.TokenSources.getTokenStreamWithOffsets(TokenSources.java:280)
at
org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingByHighlighter(DefaultSolrHighlighter.java:453)
at
org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:391)
at
org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:139)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
at
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: Input/output error
at sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
at sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:51)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:222)
at sun.nio.ch.IOUtil.read(IOUtil.java:198)
at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:674)
at
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:176)
... 32 more
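
An I/O error surfacing from `pread` at this level usually indicates a disk-level read failure rather than a Lucene bug, so the usual first diagnostic is Lucene's CheckIndex tool, run with Solr stopped. Below is a hedged sketch that only builds the invocation; the jar name and index path are placeholders, and note that `-fix` drops any unreadable segments, permanently losing the documents in them, so run a plain check first.

```python
# Sketch: building a Lucene CheckIndex invocation to diagnose a possibly
# corrupt index. The jar path and index directory are placeholders; run
# the printed command on the Solr host while Solr is stopped. -fix removes
# broken segments (data loss), so only add it after reviewing the report.
def checkindex_cmd(lucene_core_jar, index_dir, fix=False):
    """Return the java command line for org.apache.lucene.index.CheckIndex."""
    cmd = ["java", "-cp", lucene_core_jar,
           "org.apache.lucene.index.CheckIndex", index_dir]
    if fix:
        cmd.append("-fix")  # destructive: drops unreadable segments
    return cmd

print(" ".join(checkindex_cmd("lucene-core-4.2.1.jar",
                              "/app/solr/tomcat/solr/core1/data/index")))
```

If CheckIndex reports no corruption, the error is more likely transient hardware or filesystem trouble, and checking the disk (dmesg, SMART) would be the next step.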

