I would convert them to UTF-8 before posting, and use UTF-8 throughout your
application. Most of the web and most applications use UTF-8; with other
encodings you will keep running into problems.
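As a sketch of that conversion (assuming Java, since SolrJ comes up later in this thread; the class name and the single-byte sample are illustrative, not from the original post):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Minimal sketch: decode bytes from a legacy charset and re-encode them as
// UTF-8 before posting to Solr. "windows-1255" is the charset from the
// report below; any charset name Java supports would work the same way.
public class EncodingFix {
    public static byte[] toUtf8(byte[] data, String sourceCharset) {
        // Decode with the original charset, then re-encode as UTF-8.
        String text = new String(data, Charset.forName(sourceCharset));
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xE0 is the letter aleph (U+05D0) in windows-1255;
        // as UTF-8 it becomes the two bytes 0xD7 0x90.
        byte[] utf8 = toUtf8(new byte[]{(byte) 0xE0}, "windows-1255");
        System.out.println(utf8.length); // prints 2
    }
}
```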
> On 08.11.2019 at 07:47, lala wrote:
>
> I am using the /update/extract request handler to push documents
I am using the /update/extract request handler to push documents into Solr,
but some text documents that are encoded as windows-1255 (Arabic texts) are
not extracted properly: the extracted text is not readable.
I searched the web and the Solr documentation and found nothing. I need to
send the file
You can test standalone content extraction with the tika-app.jar.
To output in text format:
java -jar tika-app-0.8.jar --text file_path
For more options: java -jar tika-app-0.8.jar --help
Use the correct tika-app version jar matching the Solr build.
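As an alternative to shelling out to tika-app.jar, the same sanity check can be done programmatically with Tika's facade API (a sketch only; the class name here is illustrative, and it needs the tika jars on the classpath):

```java
import java.io.File;

import org.apache.tika.Tika;

// Sketch: extract plain text from a file with the Tika facade,
// roughly equivalent to: java -jar tika-app.jar --text file_path
public class ExtractCheck {
    public static void main(String[] args) throws Exception {
        Tika tika = new Tika();
        // parseToString auto-detects the file type and returns its text.
        String text = tika.parseToString(new File(args[0]));
        System.out.println(text);
    }
}
```

If this prints garbage for a given file, the problem is in Tika's extraction rather than in Solr itself.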
Regards,
Jayendra
On Wed, Aug
es/default/files/nodefiles/533/June 30, 2011.xltm* to Solr "0"
Status: Communication Error".
I am looking for some help in figuring out where to troubleshoot this. I
assume it's this file, but I guess I'd like to be sure - so how can I submit
this file for content extraction
In case the exact problem was not clear to somebody:
The problem with FileUpload interpreting file data as regular form fields is
that Solr then thinks there are no content streams in the request and throws
a "missing_content_stream" exception.
On Thu, Mar 10, 2011 at 10:59 AM, Karthik Shiraly <
karth
Hi,
I'm using Solr 1.4.1.
The scenario involves a user uploading multiple files. These have content
extracted using SolrCell, then indexed by Solr along with other information
about the user.
ContentStreamUpdateRequest seemed like the right choice for this - use
addFile() to send file data, and use
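For readers following along, a minimal SolrJ sketch of that approach against the Solr 1.4-era API (the Solr URL, file name, and literal field are illustrative, not from the original post; it needs the solrj jars and a running Solr):

```java
import java.io.File;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

public class ExtractUpload {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");

        // Send the file body as a content stream to the extracting handler,
        // so Solr does not mistake it for a regular form field.
        ContentStreamUpdateRequest req =
            new ContentStreamUpdateRequest("/update/extract");
        req.addFile(new File("report.pdf"));

        // Pass per-document metadata as literal.* parameters.
        req.setParam("literal.id", "doc1");
        req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);

        solr.request(req);
    }
}
```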
Hi Erik
I did a post with more details yesterday with no response.
I have a screen shot of what it does: http://screencast.com/t/MGRiZTU5M
After running it I did a query that returned 0 results, and checked how many
docs are indexed; that count is also 0.
Hope you can shed some more light.
You really have to provide more details of
a> what you did.
b> what the results were.
Have you looked at your index with the admin page and/or Luke?
Have you tried querying in the admin page?
Have you examined the logs to see what they report?
Best
Erick
On Fri, Feb 26, 2010 at 7:54 AM, Lee Smi
Hey All
Hope someone can advise.
I followed the example in the wiki on how to extract an HTML page, i.e.
curl
'http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true'
-F "myfile=@tutorial.html"
And it displayed an HTML page, but with a 404 and