Re: SWF content not indexed

2014-05-13 Thread Mauro Gregorio Binetti
Hi Ahmet,
thank you for your response... Yes, I think I need Tika to do this job,
using it like an OCR.
I am still trying to dig deeper into it.
Any other suggestion from the group is welcome.

Regards,
Mauro


On Sun, May 11, 2014 at 2:34 PM, Ahmet Arslan  wrote:

>
>
>
> Hi,
>
> Solr/Lucene only deals with text. There are some other projects that
> extract text from rich documents.
> Solr Cell uses http://tika.apache.org for extraction. Maybe Tika (or some
> other tool) already extracts text from SWF?
>
>
> On Sunday, May 11, 2014 9:40 AM, Mauro Gregorio Binetti <
> maurogregorio.bine...@gmail.com> wrote:
> Hi guys,
> how can I index the content of SWF files? I'm using Solr
> 3.6.0.
>
> Regards,
> Mauro
>


SWF content not indexed

2014-05-12 Thread Mauro Gregorio Binetti
Hi guys,
how can I index the content of SWF files? I'm using Solr
3.6.0.

Regards,
Mauro


Re: Disable searching on ddm tika metadata

2014-02-05 Thread Mauro Gregorio Binetti
Yes Mohit... That is what I tried, but I ran into serious problems
appending the query with correct syntax, starting from the trace
I posted in my last mail... Any suggestions about encoding?
On 05 Feb 2014 20:55, "Mohit Sinha" wrote:

> Hi,
>
> if you wish to execute the query in Solr Admin, you can do so by appending
> it to the select handler:
> <host>:<port>/<instance>/<core>/select?
> E.g. for localhost on port 8080, with Solr instance name solr-4.4 and core
> name test, the select query would be
> localhost:8080/solr-4.4/test/select?
>
> Hope that helps!
> - Mohit Sinha
>
>
> On Thu, Feb 6, 2014 at 1:14 AM, Jack Krupansky wrote:
>
> > (Gulp!)
> >
> > You could also set the debug parameter (temporarily) in the defaults
> > section of your query request handler. But you still need to dump the text
> > of the query response.
> >
> > -- Jack Krupansky
> >
> > -----Original Message----- From: Mauro Gregorio Binetti
> > Sent: Wednesday, February 5, 2014 12:47 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Disable searching on ddm tika metadata
> >
> > Ok... I get it. I understood what you mean.
> > But I am having some trouble encoding the query logged by the application
> > so that I can submit it to Solr Admin:
> >
> > %2B(%2B(companyId:10153)+%2B((%2B(entryClassName:com.
> > liferay.portlet.bookmarks.model.BookmarksEntry))+(%2B(
> > entryClassName:com.liferay.portlet.blogs.model.BlogsEntry))+(%2B(
> > entryClassName:com.liferay.portlet.calendar.model.CalEvent))+(%2B(
> > entryClassName:com.liferay.portlet.documentlibrary.model.
> > DLFileEntry)+%2B(status:0))+(%2B(entryClassName:com.liferay.
> > portlet.journal.model.JournalArticle)+%2B(status:0))
> > +(%2B(entryClassName:com.liferay.portlet.messageboards.
> > model.MBMessage)+%2B(discussion:false))+(%2B(entryClassName:com.liferay.
> > portlet.wiki.model.WikiPage))+(%2B(entryClassName:com.
> > liferay.portal.model.User)+%2B(status:0+%2B((-web_
> > content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> >
> custom_fields/stato:*1*+%2B(web_content/Mercato:ml*+expando/custom_fields/
> > fieldmercato:ml*)+%2BassetCategoryTitles:*attivazione*)+(-web_content/
> > Data_Scadenza:[00+TO+20140205151243]+-expando/
> >
> custom_fields/stato:*1*+%2B(web_content/Mercato:ml*+expando/custom_fields/
> > fieldmercato:m
> > l*)+%2BassetTagNames:*attivazione*)+(-web_content/
> > Data_Scadenza:[00+TO+20140205151243]+-expando/
> >
> custom_fields/stato:*1*+%2B(web_content/Mercato:ml*+expando/custom_fields/
> > fieldmercato:ml*)+%2BassetCategoryTitles:*attivazione*)+(-web_content/
> > Data_Scadenza:[00+TO+20140205151243]+-expando/
> >
> custom_fields/stato:*1*+%2B(web_content/Mercato:ml*+expando/custom_fields/
> > fieldmercato:ml*)+%2BassetTagNames:*attivazione*)
> > +(-web_content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> > custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> > custom_fields/fieldmercato:ml*)+%2Bcomments:attivazione)+(-
> > web_content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> > custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> > custom_fields/fieldmercato:ml*)+%2Bcontent:attivazione)+(-
> > web_content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> > custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> > custom_fields/fieldmercato:ml*)+%2Bdescription:attivazione)+
> > (-web_content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> > custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> > custom_fields/fieldmercato:ml*)+%2Bproperties:attivazione)+(
> > -web_content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> > custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> > custom_fields/fieldmercato:ml*)+%2Btitle:attivazione)+(-web_
> > content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> > custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> > custom_fields/fieldmercato:ml*)+%2Burl:attivazione)+(-web_
> > content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> >
> custom_fields/stato:*1*+%2B(web_content/Mercato:ml*+expando/custom_fields/
> > fieldmercato:ml*)+%2BuserName:*attivazione*)+(-web_content/
> > Data_Scadenza:[00+TO+20140205151243]+-expando/
> > custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> > custom_fields/fieldmercato:ml*)+%2Bddm/10308/ClimateForcast_
> > PROGRAM_ID:attivazione)+(-web_content/Data_Scadenza:[
> > 00+TO+20140205151243]+-expando/custom_fie
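
The encoding trouble discussed in this message can be approached by letting a library build the query string instead of pasting the logged text into the URL by hand. The following is a minimal sketch, not a confirmed fix: the helper name `build_select_url` is hypothetical, the base URL is taken from Mohit's example, and the query is a shortened, decoded stand-in for the logged one.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_select_url(base_url, query, debug=False):
    """Return a select-handler URL with the Lucene query URL-encoded."""
    params = {"q": query}
    if debug:
        # Parameter name as suggested in the thread; verify it for your version.
        params["debug"] = "true"
    # urlencode applies quote_plus: '+' becomes %2B and ' ' becomes '+'.
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Shortened, decoded stand-in for the logged query (assumption for illustration).
raw_query = "+(+(companyId:10153) +(status:0)) -threadId:*"
url = build_select_url("http://localhost:8080/solr-4.4/test", raw_query, debug=True)
print(url)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"][0] == raw_query)
```

Starting from the decoded query text matters here: the logged trace is already partly encoded (`%2B`, `+`), so encoding it a second time would change its meaning.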

Re: Disable searching on ddm tika metadata

2014-02-05 Thread Mauro Gregorio Binetti
Ahahaha, "gulp" is really funny :)
Back to the topic... Do you mean modifying solrconfig.xml?

Mauro
On 05 Feb 2014 20:45, "Jack Krupansky" wrote:

> (Gulp!)
>
> You could also set the debug parameter (temporarily) in the defaults
> section of your query request handler. But you still need to dump the text
> of the query response.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Mauro Gregorio Binetti
> Sent: Wednesday, February 5, 2014 12:47 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Disable searching on ddm tika metadata
>
> Ok... I get it. I understood what you mean.
> But I am having some trouble encoding the query logged by the application
> so that I can submit it to Solr Admin:
>
> %2B(%2B(companyId:10153)+%2B((%2B(entryClassName:com.
> liferay.portlet.bookmarks.model.BookmarksEntry))+(%2B(
> entryClassName:com.liferay.portlet.blogs.model.BlogsEntry))+(%2B(
> entryClassName:com.liferay.portlet.calendar.model.CalEvent))+(%2B(
> entryClassName:com.liferay.portlet.documentlibrary.model.
> DLFileEntry)+%2B(status:0))+(%2B(entryClassName:com.liferay.
> portlet.journal.model.JournalArticle)+%2B(status:0))
> +(%2B(entryClassName:com.liferay.portlet.messageboards.
> model.MBMessage)+%2B(discussion:false))+(%2B(entryClassName:com.liferay.
> portlet.wiki.model.WikiPage))+(%2B(entryClassName:com.
> liferay.portal.model.User)+%2B(status:0+%2B((-web_
> content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> custom_fields/stato:*1*+%2B(web_content/Mercato:ml*+expando/custom_fields/
> fieldmercato:ml*)+%2BassetCategoryTitles:*attivazione*)+(-web_content/
> Data_Scadenza:[00+TO+20140205151243]+-expando/
> custom_fields/stato:*1*+%2B(web_content/Mercato:ml*+expando/custom_fields/
> fieldmercato:m
> l*)+%2BassetTagNames:*attivazione*)+(-web_content/
> Data_Scadenza:[00+TO+20140205151243]+-expando/
> custom_fields/stato:*1*+%2B(web_content/Mercato:ml*+expando/custom_fields/
> fieldmercato:ml*)+%2BassetCategoryTitles:*attivazione*)+(-web_content/
> Data_Scadenza:[00+TO+20140205151243]+-expando/
> custom_fields/stato:*1*+%2B(web_content/Mercato:ml*+expando/custom_fields/
> fieldmercato:ml*)+%2BassetTagNames:*attivazione*)
> +(-web_content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> custom_fields/fieldmercato:ml*)+%2Bcomments:attivazione)+(-
> web_content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> custom_fields/fieldmercato:ml*)+%2Bcontent:attivazione)+(-
> web_content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> custom_fields/fieldmercato:ml*)+%2Bdescription:attivazione)+
> (-web_content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> custom_fields/fieldmercato:ml*)+%2Bproperties:attivazione)+(
> -web_content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> custom_fields/fieldmercato:ml*)+%2Btitle:attivazione)+(-web_
> content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> custom_fields/fieldmercato:ml*)+%2Burl:attivazione)+(-web_
> content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> custom_fields/stato:*1*+%2B(web_content/Mercato:ml*+expando/custom_fields/
> fieldmercato:ml*)+%2BuserName:*attivazione*)+(-web_content/
> Data_Scadenza:[00+TO+20140205151243]+-expando/
> custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> custom_fields/fieldmercato:ml*)+%2Bddm/10308/ClimateForcast_
> PROGRAM_ID:attivazione)+(-web_content/Data_Scadenza:[
> 00+TO+20140205151243]+-expando/custom_fields/stato:1+
> %2B(web_content/Mercato:ml*+expando/custom_fields/
> fieldmercato:ml*)+%2Bddm/10308/ClimateForcast_COMMAND_
> LINE:attivazione)+(-web_content/Data_Scadenza:[00+TO+
> 20140205151243]+-expando/custom_fields/stato:1+%2B(web_
> content/Mercato:ml*+expando/custom_fields/fieldmercato:ml*
> )+%2Bddm/10308/ClimateForcast_HISTORY:attivazione)+(-web_
> content/Data_Scadenza:[00+TO+20140205151243]+-expando/
> custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/
> custom_fields/fieldmercato:ml*)+%2Bddm/10308/ClimateForcast_
> TABLE_ID:attivazione)+(-web_content/Data_Scadenza:[00+TO+
> 20140205151243]+-expando/custom_fields/stato:1+%2B(web_
> content/Mercato:ml*+expando/custom_fields/fieldmercato:ml*
> )+%2Bddm/10308/ClimateForcast_INSTITUTION:attivazione)+(-
> web_content/Data_Scadenza:[000

Re: Disable searching on ddm tika metadata

2014-02-05 Thread Mauro Gregorio Binetti
)+(-web_content/Data_Scadenza:[00+TO+20140205151243]+-expando/custom_fields/stato:*1*+%2B(web_content/Mercato:ml*+expando/custom_fields/fieldmercato:ml*)+%2BuserName:*attivazione*)+-entryClassName:com.liferay.portal.model.user)++-threadId:*+(web_content/Mercato:Amministratori+web_content/Mercato:ML_INT_CUC+web_content/Mercato:SMT_INT_CUC+web_content/Mercato:W_SUPER+


On Wed, Feb 5, 2014 at 3:46 PM, Jack Krupansky wrote:

> I'm not interested in the log (although maybe somebody else can spot
> something there) - it's the query response that is returned on your query
> HTTP request (XML or JSON.) The specific parameter to add to your HTTP
> query request is "&debug=true".
>
> -- Jack Krupansky
>
> From: Mauro Gregorio Binetti
> Sent: Wednesday, February 5, 2014 9:27 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Disable searching on ddm tika metadata
>
> The Solr version used in the project is 3.6. Through Liferay I set all
> search options to DEBUG and obtained the log attached to this message (I
> searched for the string "attivazione").
> It also contains JSP traces. The relevant lines are the start and end of
> the query:
>
> 15:12:43,651 INFO [stdout] (http--0.0.0.0-80-11) Search:main_search.jsp...
>
> 15:16:16,894 INFO [org.apache.solr.core.SolrCore] (http--0.0.0.0-80-1) []
> webapp=/solr path=/select...
>
>
> So the search takes about three and a half minutes in this case!
>
> Mauro
>
>
>
> On Wed, Feb 5, 2014 at 1:08 PM, Jack Krupansky 
> wrote:
>
>   Simply post to this mail list the timing section of the query response
> for a test query that you feel is too slow, but be sure to add the
> debug=true parameter (or debug=timing.)
>
>
>   -- Jack Krupansky
>
>   -----Original Message----- From: Mauro Gregorio Binetti
>
>   Sent: Wednesday, February 5, 2014 6:44 AM
>   To: solr-user@lucene.apache.org
>   Subject: Re: Disable searching on ddm tika metadata
>
>
>   Hi Jack, can you please give me some more details? Are you referring to a
>   tool in particular?
>
>   Mauro
>
>
>   On Wed, Feb 5, 2014 at 12:20 PM, Jack Krupansky wrote:
>
>
> Run some test queries with the debug=true parameter and check the timing
> section of the response to see what search components are consuming the
> time. Highlighting of large documents can be very slow, for example. Or, if
> you return the full text of a document, the raw size can slow the response
> to a query.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Mauro Gregorio Binetti
> Sent: Wednesday, February 5, 2014 5:17 AM
> To: solr-user@lucene.apache.org
> Subject: Disable searching on ddm tika metadata
>
>
> Hi everybody,
> I'm a newbie and I'm working on search performance in a project without
> any documentation.
> I think searching is very slow because of all the Tika metadata
> fields; what do you think? To test whether that's true, I'm trying to
> disable searching on all of these technical fields. I tried,
> but have had no results so far.
> This is what I tried: I modified schema.xml by adding this new line:
>
> 
>
> but I still see all the parameters (like ddm/*) in the "select params"
> logged by Solr.
> Any hints?
>
> Thank you,
> Mauro
>
>
>
>
>
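
Jack's advice to read the timing section of the debug response can be sketched as follows. This assumes the response was requested as JSON (`wt=json`) and that timings sit under `debug.timing.process` with one nested object per search component. The sample response and its numbers are made up for illustration, and on a 3.x release the older `debugQuery=true` parameter may be the one that applies, so check both assumptions against your version.

```python
import json


def slowest_components(response_text, top=3):
    """Return (component, milliseconds) pairs from debug.timing.process, slowest first."""
    timing = json.loads(response_text)["debug"]["timing"]
    process = timing.get("process", {})
    # Each component is a nested object with its own "time"; the plain
    # "time" key at this level is the phase total, not a component.
    per_component = [
        (name, entry["time"])
        for name, entry in process.items()
        if isinstance(entry, dict) and "time" in entry
    ]
    return sorted(per_component, key=lambda item: item[1], reverse=True)[:top]


# Hypothetical response fragment; the component names and numbers are invented.
sample = json.dumps({
    "debug": {
        "timing": {
            "time": 213000.0,
            "prepare": {"time": 5.0, "query": {"time": 5.0}},
            "process": {
                "time": 212995.0,
                "query": {"time": 1200.0},
                "facet": {"time": 300.0},
                "highlight": {"time": 211000.0},
                "debug": {"time": 495.0},
            },
        }
    }
})
print(slowest_components(sample))
```

If one component dominates the list, that is the place to look first, for example highlighting over very large documents as mentioned in the thread.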


Re: Disable searching on ddm tika metadata

2014-02-05 Thread Mauro Gregorio Binetti
Hi Jack, can you please give me some more details? Are you referring to a
tool in particular?

Mauro


On Wed, Feb 5, 2014 at 12:20 PM, Jack Krupansky wrote:

> Run some test queries with the debug=true parameter and check the timing
> section of the response to see what search components are consuming the
> time. Highlighting of large documents can be very slow, for example. Or, if
> you return the full text of a document, the raw size can slow the response
> to a query.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Mauro Gregorio Binetti
> Sent: Wednesday, February 5, 2014 5:17 AM
> To: solr-user@lucene.apache.org
> Subject: Disable searching on ddm tika metadata
>
>
> Hi everybody,
> I'm a newbie and I'm working on search performance in a project without
> any documentation.
> I think searching is very slow because of all the Tika metadata
> fields; what do you think? To test whether that's true, I'm trying to
> disable searching on all of these technical fields. I tried,
> but have had no results so far.
> This is what I tried: I modified schema.xml by adding this new line:
>
> 
>
> but I still see all the parameters (like ddm/*) in the "select params"
> logged by Solr.
> Any hints?
>
> Thank you,
> Mauro
>


Re: Disable searching on ddm tika metadata

2014-02-05 Thread Mauro Gregorio Binetti
I'm submitting data via Liferay, which uses Apache Lucene/Solr for its search
feature. Nothing is done directly on Solr.
solrconfig.xml is currently configured like this:

  

  
  text
  true
  ignored_

  
  true
  links
  ignored_

  

but I still have traces like this in "select params" (look at ddm/*):

+%2Bddm/10308/ClimateForcast_PROGRAM_ID:documento)+(-web_content/Data_Scadenza:[00+TO+20140205121824]+-expando/custom_fields/stato:1+%2B(web_content/Mercato:ml*+expando/custom_fields/fieldmercato:ml*)+%2Bddm/10308/ClimateForcast_COMMAND_LINE:documento)

You suggested reindexing. I see I can do it via Liferay, but I can't say how
long it will take. Is your suggestion still valid given the details I posted?

Mauro


On Wed, Feb 5, 2014 at 11:28 AM, Alexandre Rafalovitch
wrote:

> Did you reindex?
>
> Also, how are you submitting data? Are you using
> ExtractingRequestHandler (defined in your solrconfig.xml)?
>
> If so, there is already a mechanism for that. Just search for ignored
> in the documentation:
> http://wiki.apache.org/solr/ExtractingRequestHandler .
>
> Regards,
>Alex.
>
> Personal website: http://www.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all
> at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
> book)
>
>
> On Wed, Feb 5, 2014 at 5:17 PM, Mauro Gregorio Binetti
>  wrote:
> > Hi everybody,
> > I'm a newbie and I'm working on search performance in a project without
> > any documentation.
> > I think searching is very slow because of all the Tika metadata
> > fields; what do you think? To test whether that's true, I'm trying to
> > disable searching on all of these technical fields. I tried,
> > but have had no results so far.
> > This is what I tried: I modified schema.xml by adding this new line:
> >
> > 
> >
> > but I still see all the parameters (like ddm/*) in the "select params"
> > logged by Solr.
> > Any hints?
> >
> > Thank you,
> > Mauro
>
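
To see which `ddm/*` fields a logged query actually touches, the "select params" trace from this message can be decoded and scanned. This is only a sketch for inspecting the log, not a way to disable those fields; the helper name `ddm_fields` and the regular expression are assumptions, and the trace is the fragment posted above.

```python
import re
from urllib.parse import unquote_plus


def ddm_fields(encoded_query):
    """Decode a URL-encoded query and return the distinct ddm/* field names."""
    decoded = unquote_plus(encoded_query)
    # A field name runs from "ddm/" up to the ':' that precedes the search term.
    return sorted(set(re.findall(r"ddm/[^:\s()]+", decoded)))


# Fragment of the trace posted in this message.
trace = (
    "+%2Bddm/10308/ClimateForcast_PROGRAM_ID:documento)+(-web_content/"
    "Data_Scadenza:[00+TO+20140205121824]+-expando/custom_fields/stato:1+"
    "%2B(web_content/Mercato:ml*+expando/custom_fields/fieldmercato:ml*)+"
    "%2Bddm/10308/ClimateForcast_COMMAND_LINE:documento)"
)
for field in ddm_fields(trace):
    print(field)
```

Because these clauses appear in the query text itself, they are presumably generated by the application that builds the query, which would explain why they still show up after schema changes.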


Disable searching on ddm tika metadata

2014-02-05 Thread Mauro Gregorio Binetti
Hi everybody,
I'm a newbie and I'm working on search performance in a project without
any documentation.
I think searching is very slow because of all the Tika metadata
fields; what do you think? To test whether that's true, I'm trying to
disable searching on all of these technical fields. I tried,
but have had no results so far.
This is what I tried: I modified schema.xml by adding this new line:



but I still see all the parameters (like ddm/*) in the "select params"
logged by Solr.
Any hints?

Thank you,
Mauro