Re: Solr contribs build and jar-of-jars

2012-08-19 Thread Chantal Ackermann
Hi Lance,

does this do what you want?

http://maven.apache.org/plugins/maven-assembly-plugin/descriptor-refs.html#jar-with-dependencies

It's Maven, but I'd say that would be an advantage… ;-)

Chantal

On 05.08.2012 at 01:25, Lance Norskog wrote:

> Has anybody tried packaging the contrib distribution jars in the
> jar-of-jars format? Or merging all included jars into one super-jar?
> 
> The OpenNLP contrib has a Lucene analyzer, 3 external jars, and Solr
> classes. Packaging this sucker is proving painful in the extreme. UIMA
> has the same problem. 'ant' has a task for generating the manifest
> class path for a jar-of-jars, and the technique actually works:
> 
> http://ant.apache.org/manual/Tasks/manifestclasspath.html
> http://stackoverflow.com/questions/858766/generate-manifest-class-path-from-classpath-in-ant
> http://grokbase.com/t/ant/user/0213wdmn51/building-a-fileset-dynamically#20020103j47ufvwooklrovrjfdvirgohe4
> 
> If this works completely, it seems like the right way to build the
> dist/ jars for the contribs.
> 
> -- 
> Lance Norskog
> goks...@gmail.com



Re: scanned pdf with solr cell

2012-08-19 Thread Lance Norskog
The backstory here is that Tika uses a library that for some crazy
reason is inside the Java AWT graphics toolkit. (I think the RTF
parser?)

On Wed, Aug 15, 2012 at 5:57 AM, Ahmet Arslan  wrote:
>> You can try passing
>> -Djava.awt.headless=true as one of the arguments
>> when you start Jetty to see if you can get this to go away
>> with no ill
>> effects.
>
> I started Jetty using 'java -Djava.awt.headless=true -jar start.jar' and
> successfully indexed two PDF files. That icon didn't appear :) Thanks!



-- 
Lance Norskog
goks...@gmail.com


Re: Chinese character not encoded for facet.prefix but encoded for q field

2012-08-19 Thread Lance Norskog
Use the 'text_cjk' field type for your Chinese language text.

Chinese-language search is not simple, and Solr/Lucene are almost there
with a usable solution.

On Thu, Aug 16, 2012 at 4:23 AM, Rajani Maski  wrote:

> Chinese character not encoded for facet.prefix but encoded for q field.
>
> Why? What might be the problem?
>
> This is done:
>connectionTimeout="2"
>redirectPort="8443" URIEncoding="UTF-8"/>
>
> [image: Inline image 2]
>


-- 
Lance Norskog
goks...@gmail.com


Re: Atomic Multicore Operations - E.G. Move Docs

2012-08-19 Thread Lance Norskog
I would use generation numbers on documents, and communicate a global
generation number in ZK.
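
A minimal sketch of one way to read the generation-number idea, written in Python for brevity. Everything here is an assumption for illustration: `Core` and `GlobalGeneration` are in-memory stand-ins (not Solr or ZooKeeper APIs), and the `valid_from` / `valid_until` fields are hypothetical. Readers only see documents whose generation range covers the published global generation, so a move becomes visible in both cores at the moment the single global value changes.

```python
import math


class Core:
    """In-memory stand-in for a Solr core (hypothetical, not a Solr API)."""

    def __init__(self):
        # doc id -> [valid_from, valid_until) generation range
        self.docs = {}

    def add(self, doc_id, valid_from):
        self.docs[doc_id] = [valid_from, math.inf]

    def retire(self, doc_id, valid_until):
        self.docs[doc_id][1] = valid_until

    def visible(self, generation):
        # A reader filters on the generation it read from the shared store.
        return sorted(
            doc_id
            for doc_id, (start, end) in self.docs.items()
            if start <= generation < end
        )


class GlobalGeneration:
    """Stand-in for one ZooKeeper znode holding the current generation."""

    def __init__(self):
        self.value = 0

    def publish(self, new_value):
        self.value = new_value


def move_docs(doc_ids, source, target, zk):
    """Stage the move under the next generation, then publish it in one write."""
    next_gen = zk.value + 1
    for doc_id in doc_ids:
        target.add(doc_id, valid_from=next_gen)
        source.retire(doc_id, valid_until=next_gen)
    zk.publish(next_gen)  # the only step readers can observe
```

In this sketch a reader that still holds the old generation keeps seeing the old state in both cores, and a reader with the new generation sees the moved state in both. In a real setup every query would need a filter on the generation fields.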

On Thu, Aug 16, 2012 at 2:22 AM, Nicholas Ball
 wrote:
>
> I've been close to implementing a 2PC protocol before for something else,
> however for this it's not needed.
> As the move operation will be done on a single node which has both the
> cores, this could be done differently. Just not entirely sure how to do it.
>
> When a commit is done at the moment, the core must get locked somehow;
> that is the point at which we should lock the other core too, if a move
> operation is being executed.
>
> Nick
>
> On Thu, 16 Aug 2012 10:32:10 +0800, Li Li  wrote:
>> http://zookeeper.apache.org/doc/r3.3.6/recipes.html#sc_recipes_twoPhasedCommit
>>
>> On Thu, Aug 16, 2012 at 7:41 AM, Nicholas Ball
>>  wrote:
>>>
>>> Haven't managed to find a good way to do this yet. Does anyone have any
>>> ideas on how I could implement this feature?
>>> Really need to move docs across from one core to another atomically.
>>>
>>> Many thanks,
>>> Nicholas
>>>
>>> On Mon, 02 Jul 2012 04:37:12 -0600, Nicholas Ball
>>>  wrote:
>>>> That could work, but then how do you ensure commit is called on the two
>>>> cores at the exact same time?
>>>>
>>>> Cheers,
>>>> Nicholas
>>>>
>>>> On Sat, 30 Jun 2012 16:19:31 -0700, Lance Norskog wrote:
>>>>> Index all documents to both cores, but do not call commit until both
>>>>> report that indexing worked. If one of the cores throws an exception,
>>>>> call rollback on both cores.
>>>>>
>>>>> On Sat, Jun 30, 2012 at 6:50 AM, Nicholas Ball wrote:
>>>>>>
>>>>>> Hey all,
>>>>>>
>>>>>> Trying to figure out the best way to perform atomic operations across
>>>>>> multiple cores on the same Solr instance, i.e. a multi-core environment.
>>>>>>
>>>>>> An example would be to move a set of docs from one core onto another core
>>>>>> and ensure that a soft commit is done at the exact same time. If one were
>>>>>> to fail, so would the other.
>>>>>> Obviously this would probably require some customization, but I wanted to
>>>>>> know what the best way to tackle this would be and where I should be
>>>>>> looking in the source.
>>>>>>
>>>>>> Many thanks for the help in advance,
>>>>>> Nicholas a.k.a. incunix
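
The suggestion quoted above (index to both cores, commit only when both report success, otherwise roll back both) could be sketched roughly as follows. This is a Python illustration under stated assumptions: `FakeCore` is a hypothetical in-memory stand-in with `add`, `commit` and `rollback` methods, not a real Solr client, and `fail_on` exists only to simulate an indexing error.

```python
class FakeCore:
    """Hypothetical in-memory core used only to illustrate the control flow."""

    def __init__(self, fail_on=None):
        self.pending = []       # added but not yet committed
        self.committed = []     # visible after commit
        self.fail_on = fail_on  # doc that triggers a simulated failure

    def add(self, doc):
        if doc == self.fail_on:
            raise RuntimeError(f"indexing failed for {doc!r}")
        self.pending.append(doc)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending = []

    def rollback(self):
        self.pending = []


def index_to_both(docs, core_a, core_b):
    """Add docs to both cores; commit only if every add succeeded."""
    cores = (core_a, core_b)
    try:
        for core in cores:
            for doc in docs:
                core.add(doc)
    except Exception:
        # One core rejected a document: discard pending work everywhere.
        for core in cores:
            core.rollback()
        return False
    for core in cores:
        core.commit()
    return True
```

Note that this only avoids committing after a failed add. The two commits still run one after the other, so it does not answer the follow-up question in the thread about committing at the exact same time.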



-- 
Lance Norskog
goks...@gmail.com


Re: How to make a server become a replica / leader for a collection at startup

2012-08-19 Thread Jed Glazner
Hey Mark,

Thanks for the extra effort in responding :)

Are you OK with me filing a JIRA ticket and completing this feature on trunk? We need
this feature for a project.

Jed Glazner
Sr. Software Engineer
Adobe
jglaz...@adobe.com

- Reply message -
From: "Mark Miller" 
To: "solr-user@lucene.apache.org" 
Subject: How to make a server become a replica / leader for a collection at 
startup
Date: Sun, Aug 19, 2012 9:11 am



Hmm...last email was blocked from the list as spam :)

Let me try again forcing plain text:


Hey Jed,

I think what you are looking for is something I have proposed, but is not
implemented yet. We started with a fairly simple collections API since we
just wanted to make sure we had something in 4.0.

I would like it to be better though. My proposal was that when you create a
new collection with n shards and z replicas, that should be recorded in
ZooKeeper by the Overseer. The Overseer should then watch for when a new
node comes up - then trigger a process that compares the config for the
collection against the real world - and remove or add based on that info.

I don't think it's that difficult to do, but given a lot of other things we
are working on, and the worry of destabilizing anything before the 4.0
release, I think it's more likely to come in a point release later. It's
not super complicated work, but there are some tricky corner cases I think.

- Mark


Re: solr-user-unsubscribe

2012-08-19 Thread Michael Della Bitta
Just FYI folks, this doesn't work. You need to send mail to
solr-user-unsubscr...@lucene.apache.org, not the list.

Michael Della Bitta


Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
www.appinions.com
Where Influence Isn’t a Game


On Sun, Aug 19, 2012 at 12:04 PM, Adamsky, Robert
 wrote:
> solr-user-unsubscribe 


solr-user-unsubscribe

2012-08-19 Thread Adamsky, Robert
solr-user-unsubscribe 

solr-user-unsubscribe

2012-08-19 Thread Алексей Цой
solr-user-unsubscribe 


Re: How to make a server become a replica / leader for a collection at startup

2012-08-19 Thread Mark Miller
Hmm...last email was blocked from the list as spam :)

Let me try again forcing plain text:


Hey Jed,

I think what you are looking for is something I have proposed, but is not
implemented yet. We started with a fairly simple collections API since we
just wanted to make sure we had something in 4.0.

I would like it to be better though. My proposal was that when you create a
new collection with n shards and z replicas, that should be recorded in
ZooKeeper by the Overseer. The Overseer should then watch for when a new
node comes up - then trigger a process that compares the config for the
collection against the real world - and remove or add based on that info.
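
A rough sketch of the comparison step described above, in Python. This is not how the Overseer works (the feature is only proposed in this message); the function name, the data shapes, and the rule that a new node takes at most one replica are all assumptions made for illustration.

```python
def plan_actions(desired, actual, new_node):
    """Compare desired replica counts with the live state and list changes.

    desired:  {shard name: wanted number of replicas}   (hypothetical shape)
    actual:   {shard name: set of node names hosting a replica}
    new_node: name of the node that just came up
    """
    actions = []
    node_used = False  # assumption: a new node hosts at most one replica
    for shard in sorted(desired):
        nodes = sorted(actual.get(shard, ()))
        wanted = desired[shard]
        if len(nodes) < wanted and not node_used and new_node not in nodes:
            actions.append(("add", shard, new_node))
            node_used = True
        elif len(nodes) > wanted:
            # Over-replicated shard: drop the surplus replicas.
            for extra in nodes[wanted:]:
                actions.append(("remove", shard, extra))
    return actions
```

For example, with two shards that each want two replicas, where shard2 currently has only one, a newly started node would be planned as the missing shard2 replica. The tricky corner cases mentioned in the message (nodes flapping, concurrent changes) are not handled here.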

I don't think it's that difficult to do, but given a lot of other things we
are working on, and the worry of destabilizing anything before the 4.0
release, I think it's more likely to come in a point release later. It's
not super complicated work, but there are some tricky corner cases I think.

- Mark


Re: Need Help - Solr - Sitecore integration

2012-08-19 Thread Jan Frühwacht
Hi,

Solr uses common standards: indexing and searching go through HTTP requests,
with JSON and XML as two of many possible response formats.

I think you should download Solr and look through the wiki to see how to index
some example documents and query them (http://wiki.apache.org/solr/).
Once you have done this, look at the following page and integrate Solr
using one of the clients listed there:

http://wiki.apache.org/solr/IntegratingSolr

There are different .NET implementations that act as a Solr client.
These implementations already provide an API that may be easy to
integrate into Sitecore.
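
To make the "plain HTTP" point concrete, here is a small Python sketch that builds a search request URL. The language is incidental; a .NET client would issue the same kind of request. The base URL (`http://localhost:8983/solr`, as used by the example server) and the helper name `build_select_url` are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_select_url(query, base_url="http://localhost:8983/solr",
                     response_format="json", rows=10):
    """Build a URL for Solr's /select handler.

    q is the query string, wt selects the response writer (json, xml, ...),
    and rows limits the number of returned documents.
    """
    params = urlencode({"q": query, "wt": response_format, "rows": rows})
    return f"{base_url}/select?{params}"
```

Fetching the resulting URL with any HTTP client returns the search results in the requested format, which the calling application can then parse.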

Kind regards,
Jan

2012/8/16 Samuthira Pandi S 

> Hi,
>
> Currently I am working as a Sitecore developer.
> My client would like to implement Solr search integration in my Sitecore
> application.
> I don't know how to implement this. If you have any document related to it,
> kindly share the configuration document.
>
> Thanks & regards
> Samuthirapandi.S


Re: Your opinion please concerning prod' installation

2012-08-19 Thread Jan Frühwacht
In my opinion there are different ways to secure the server, and which one
fits depends on how you want to integrate the Solr search.
Most use cases have some server-side processing in between, instead of
having a front-end web application access the Solr servlets directly
(this prevents unwanted end-user interactions with the search).
This really depends on your environment (whether the application consists
mainly of frontend or backend logic).

However, it's important to think about the following things:

- Should the user also be able to update / delete documents? If not, only the
  "/select" requestHandler should be configured for the production
  environment (see the configuration in solrconfig.xml:
  http://wiki.apache.org/solr/SolrConfigXml).
- How many queries will there be in the environment? Think of caching,
  optimizing the index, and being prepared for the load of your production
  systems.
- Is there any data you will index which the end user should not be
  able to see? (Note the distinction between indexed and stored fields.)

Hope this helps to start.

If you have more detailed questions on securing the server, please be a
bit more specific.

Kind regards,
Jan



2012/8/16 Bruno Mannina 

> Dear All,
>
> I would like to present my plan for installing Solr on a brand
> new production server.
>
> 1. uninstall the apache-tomcat provided with standard Ubuntu 12.04
> 2. uninstall the Sun Java provided with standard Ubuntu 12.04
>
> 3. install Java 7
> 4. install apache-tomcat
>
> 5. download Solr 3.6.1 (or do you think I can download 4.0?)
>
> 6. secure the server (OK, but how?)
>
> Thanks for your comments!
>
> Of course, if you have a link that details this whole process, that would
> be great (not one page for each step; I have those).
>
> Have a nice day !
> Bruno
>
>


Re: Solr 4 dataimport problem.

2012-08-19 Thread Erick Erickson
SolrJ is completely irrelevant for using DIH.
The only time you would need it is
if you're going to write a Java program to push
data into your index. That can be done by using
a JDBC driver to connect to your SQL server
and then pushing docs to Solr, see:

http://searchhub.org/dev/2012/02/14/indexing-with-solrj/

Best
Erick



On Sat, Aug 18, 2012 at 10:22 AM, Val  wrote:
> Hi Gora,
>
> First of all thank you, and I will try to look closely at example-DIH (so I
> guess the rest of my email can be ignored), thanks!
>
> I'm using DataImportHandler as described here:
> http://wiki.apache.org/solr/DataImportHandler. And I have the binary
> distribution, not one built from source.
>
> I meant that I'm using the example/ folder which is included with the
> binary Solr archive to create a DataImportHandler which imports from MySQL
> database.
>
> I'm running  "java -jar start.jar" from
> /home/my/projects/apache-solr-4.0.0-BETA/example.
>
> And the example-DIH is really working for me. I just wanted to use the
> above folder in order to experiment and try to configure the dataimport
> myself.
>
> On the thread I mentioned, someone wrote:
> [[You need to find "apache-solr-solrj-4.0.jar" from your distribution and
> put it in the classpath somewhere. Perhaps the easiest thing is to include
> it in your core's "lib" directory.]]
> So I tried that too, and therefore I mentioned SolrJ.
>
> Thanks.
>
>
> On Sat, Aug 18, 2012 at 6:23 PM, Gora Mohanty  wrote:
>
>> On 18 August 2012 19:50, Val  wrote:
>> > Hi all,
>> >
>> > I'm having trouble using dataimport, so maybe you can help me. I've
>> > downloaded the beta version of Solr 4.
>> > I already posted a question here
>> > <http://stackoverflow.com/questions/12018422/classnotfoundexception-dataimport-dataimporthandler>,
>> > so I don't want to repeat it. But in short:
>> > I want to import from MySQL, and I configured everything as needed. I'm
>> > getting a DataImportHandler exception, with no more output about the
>> > nature of the error.
>> [...]
>>
>> Are you indexing data using the DataImportHandler or SolrJ?
>> The error, and your post seem to refer to DIH but the
>> StackOverflow thread mentions SolrJ libraries.
>>
>> Are you building Solr 4.0.0-BETA from source, or are you using
>> the binary distribution? Could you clarify what you mean by
>> "example folder for my MySQL DB" in your StackOverflow question,
>> i.e., please provide the filesystem path from where you are doing
>> a "java -jar start.jar". If you are in
>> apache-solr-4.0.0-BETA/example/example-DIH please read the
>> README.txt there on how to start Solr for the
>> DataImportHandler example configuration: You need to specify
>> solr.solr.home. Everything should be ready to run from the binary
>> distribution without needing to change any configuration files.
>>
>> Regards,
>> Gora
>>
>
>
>
> --
>
> Regards,
>
> Val
>