Upgrade to Tika 0.7
---
Key: SOLR-1819
URL: https://issues.apache.org/jira/browse/SOLR-1819
Project: Solr
Issue Type: Improvement
Reporter: Tricia Williams
Assignee: Grant Ingersoll
[
https://issues.apache.org/jira/browse/SOLR-1235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742526#action_12742526
]
Tricia Williams commented on SOLR-1235:
---
This commit causes the example-DIH to
Thanks. I just realized that this type of information is included in
the NOTICE.txt file.
Shalin Shekhar Mangar wrote:
Solr is using the woodstox implementation:
1. wstx-asl-3.2.7.jar
2. geronimo-stax-api_1.0_spec-1.0.1.jar
Hi Folks,
I think I've identified a bug in stax-api-1.0.jar. Specifically
javax.xml.stream.XMLStreamReader::getTextStart(). I'm just wondering who
owns/manages the source of the implementation that Solr uses?
https://stax-utils.dev.java.net/ gives four implementations, I just need
to know w
[
https://issues.apache.org/jira/browse/SOLR-380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665195#action_12665195
]
Tricia Williams commented on SOLR-380:
--
Hi Laurent,
Thanks for your interest i
[
https://issues.apache.org/jira/browse/SOLR-854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647476#action_12647476
]
Tricia Williams commented on SOLR-854:
--
Running the example is something I freque
Hi,
I am just wondering what the status of these issues (SOLR-139 and
SOLR-828) is for supporting updateable/modifiable documents. Specifically
* Is anyone clinging to an update of the SOLR-139 patch?
(https://issues.apache.org/jira/secure/attachment/12379550/Eriks-ModifiableD
[
https://issues.apache.org/jira/browse/SOLR-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641694#action_12641694
]
Tricia Williams commented on SOLR-532:
--
Thanks Grant. That's much cleaner
ement? Would
this same problem crop up in normal (or even abnormal) usage of Solr
deployed on a similar system?
Tricia
Yonik Seeley wrote:
OK, I've now scaled back the test by a factor of 10 (50 segments
instead of 500).
-Yonik
On Tue, Jul 29, 2008 at 4:53 PM, Tricia Williams
<[EM
As of revision 680834 the DirectUpdateHandlerOptimizeTest is still
failing. I haven't made any changes to the file handle limit on my machine.
Tricia
Yonik Seeley wrote:
I just committed a fix that will make the test use the compound file
format. Hopefully that will be sufficient.
-Yonik
This same thing happens to me since DirectUpdateHandlerOptimizeTest was
added to the repository.
How does one increase the file handle limit in ubuntu?
Thanks,
Tricia
Shalin Shekhar Mangar wrote:
Yes, it happens on a fresh checkout too.
cat /proc/sys/fs/file-max gives 204979 on my box. The t
[
https://issues.apache.org/jira/browse/SOLR-380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-380:
-
Attachment: xmlpayload-example.zip
xmlpayload-example.zip contains a specialized version of the
[
https://issues.apache.org/jira/browse/SOLR-380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-380:
-
Attachment: xmlpayload.jar
xmlpayload.jar is the deployable jar that can be dropped into your
[
https://issues.apache.org/jira/browse/SOLR-380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-380:
-
Attachment: xmlpayload-src.jar
xmlpayload-src.jar contains the source files and junit test and ant
[
https://issues.apache.org/jira/browse/SOLR-380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591873#action_12591873
]
Tricia Williams commented on SOLR-380:
--
After a lengthy absence I've returne
[
https://issues.apache.org/jira/browse/SOLR-380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-380:
-
Attachment: (was: lucene-core-2.3-dev.jar)
> There's no way to convert search resu
[
https://issues.apache.org/jira/browse/SOLR-522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-522:
-
Attachment: SOLR-522-analysis.jsp.patch
Modified patch uses Yonik's suggestion:
As a useful
Replies to several comments in this thread inline:
Grant Ingersoll wrote:
Yes, that is definitely the case, but I think Tricia was more getting
at how to use them for display, i.e. deserializing them into a String
or whatever. I still have on my plate that I want to figure out how
to incorpora
Hi,
I think that displaying the payload (if one exists) of each token in
the analysis.jsp would be beneficial. My simple solution was to add a
row to the existing table, convert the Payload byte array to a String
and simply print the results. I opened SOLR-522 to this effect.
There i
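The conversion described above (payload byte array to a printable String) could be sketched as follows. This is a minimal illustration, not the SOLR-522 patch itself: the class name PayloadText and the method payloadToString are hypothetical, a plain byte array stands in for the Lucene payload, and UTF-8 is an assumed encoding, since a payload is arbitrary bytes and may not be text at all.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Solr or Lucene.
final class PayloadText {
    private PayloadText() {}

    /**
     * Decodes the slice data[offset, offset + length) as UTF-8 text.
     * Returns an empty string when there is no payload to show.
     * Assumption: the payload bytes were written as UTF-8 text.
     */
    static String payloadToString(byte[] data, int offset, int length) {
        if (data == null || length == 0) {
            return "";
        }
        return new String(data, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] payload = "page 3".getBytes(StandardCharsets.UTF_8);
        System.out.println(payloadToString(payload, 0, payload.length));
    }
}
```

For payloads that are not text, a hex dump of the same slice would probably be a safer thing to display.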
[
https://issues.apache.org/jira/browse/SOLR-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-532:
-
Attachment: SOLR-532-WordDelimiterFilter.patch
Quick fix. Does this need a unit test to go with
WordDelimiterFilter ignores payloads
Key: SOLR-532
URL: https://issues.apache.org/jira/browse/SOLR-532
Project: Solr
Issue Type: Bug
Reporter: Tricia Williams
Priority: Minor
Yonik Seeley wrote:
On Thu, Apr 3, 2008 at 11:46 AM, Tricia Williams
<[EMAIL PROTECTED]> wrote:
When a WordDelimiterFilter ingests a token stream and creates a new token
(newTok) it appears to copy most of the old token attributes, except the
payload. I believe this is a bu
Hi,
When a WordDelimiterFilter ingests a token stream and creates a new
token (newTok) it appears to copy most of the old token attributes,
except the payload. I believe this is a bug. My solution is for the
WordDelimiterFilter to use the Token clone() method to create a carbon
copy and
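A minimal sketch of the clone-based idea described above, assuming a simplified stand-in for a token. SimpleToken and its fields are hypothetical and are not Lucene's Token class; the point is only to contrast a field-by-field copy that forgets the payload with a clone that carries it along.

```java
// Hypothetical stand-in for a token; not Lucene's Token class.
final class SimpleToken implements Cloneable {
    String term;
    int startOffset;
    int endOffset;
    byte[] payload; // may be null

    SimpleToken(String term, int startOffset, int endOffset, byte[] payload) {
        this.term = term;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
        this.payload = payload;
    }

    /** Copies every field, including a defensive copy of the payload bytes. */
    @Override
    public SimpleToken clone() {
        try {
            SimpleToken copy = (SimpleToken) super.clone();
            if (payload != null) {
                copy.payload = payload.clone();
            }
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: class is Cloneable
        }
    }

    /** Field-by-field copy that drops the payload (the behavior reported as a bug). */
    static SimpleToken copyWithoutPayload(SimpleToken t, String term, int start, int end) {
        return new SimpleToken(term, start, end, null);
    }

    /** Clone first, then overwrite only the fields that differ for the sub-word. */
    static SimpleToken subToken(SimpleToken t, String term, int start, int end) {
        SimpleToken copy = t.clone();
        copy.term = term;
        copy.startOffset = start;
        copy.endOffset = end;
        return copy;
    }
}
```

Starting from a clone means any attribute added to the token later is carried over by default, rather than needing a matching change in every filter that builds new tokens.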
[
https://issues.apache.org/jira/browse/SOLR-522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-522:
-
Attachment: SOLR-522-analysis.jsp.patch
Added if block to analysis.jsp which converts the
Solr
Issue Type: Improvement
Components: web gui
Reporter: Tricia Williams
Priority: Trivial
Add payload content to the verbose output of the analysis.jsp page for debugging
purposes.
Thanks for clearing that up. I understand the design decision behind
not using Iterable now.
Tricia
Yonik Seeley wrote:
On Thu, Mar 20, 2008 at 4:58 PM, Tricia Williams
<[EMAIL PROTECTED]> wrote:
Is your advice not to use DocSet, or not to use Iterable?
Not to use Iterable
I wouldn't use it myself for most things... boxing each integer in a
big set is a nice waste of CPU.
-Yonik
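The boxing cost mentioned above can be illustrated with a small sketch. IntDocSet is a hypothetical class written for this example and is not Solr's DocSet; it uses java.util.PrimitiveIterator from modern Java (8 and later) simply to show the difference between handing out int values and handing out Integer objects.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.PrimitiveIterator;

// Hypothetical set of document ids; not Solr's DocSet.
final class IntDocSet implements Iterable<Integer> {
    private final int[] docs;

    IntDocSet(int... docs) {
        this.docs = docs.clone();
    }

    /** Primitive iteration: nextInt() returns an int, so no Integer is created. */
    PrimitiveIterator.OfInt intIterator() {
        return Arrays.stream(docs).iterator();
    }

    /** Iterable support: every next() call boxes the id into an Integer. */
    @Override
    public Iterator<Integer> iterator() {
        return intIterator(); // PrimitiveIterator.OfInt is also an Iterator<Integer>
    }

    /** Sums ids through the primitive path. */
    static long sumPrimitive(IntDocSet set) {
        long sum = 0;
        for (PrimitiveIterator.OfInt it = set.intIterator(); it.hasNext();) {
            sum += it.nextInt();
        }
        return sum;
    }

    /** Sums ids through the for-each path, boxing and unboxing each id. */
    static long sumBoxed(IntDocSet set) {
        long sum = 0;
        for (Integer doc : set) {
            sum += doc;
        }
        return sum;
    }
}
```

Both loops compute the same result; the for-each version is shorter to write but pays for an Integer per document, which is the overhead being warned about for large sets.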
On Thu, Mar 20, 2008 at 3:48 PM, Tricia Williams
<[EMAIL PROTECTED]> wrote:
In my custom search component I'm using the DocSet
(http://lucene.apache.org/solr/
Hi,
In my custom search component I'm using the DocSet
(http://lucene.apache.org/solr/api/org/apache/solr/search/DocSet.html)
supplied by a ResponseBuilder to iterate over TermPositions matched by
the user's query and output the payload at each position.
If DocSet implemented the Iterable
[
https://issues.apache.org/jira/browse/SOLR-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-386:
-
Attachment: SOLR-386-SolrHighlighter.patch
OK. So I think I fixed the whitespace problem.
Thanks
y---JIRA seems
down at the moment so I can't check).
cheers,
-Mike
On 5-Mar-08, at 2:49 PM, Tricia Williams wrote:
Thanks to Grant and Mike for the feedback! It is much appreciated.
Is there a quick and easy way to check for unnecessary whitespace
changes? It isn't that hard for
Thanks to Grant and Mike for the feedback! It is much appreciated. Is
there a quick and easy way to check for unnecessary whitespace changes?
It isn't that hard for me to go through the patch by hand to find and
remove them, but if there is an easier way I'm happy to hear it.
I had taken th
Hi All,
Just a quick reminder that I'd really appreciate some feedback on
the patch I built for SOLR-386
(https://issues.apache.org/jira/browse/SOLR-386). I'm really interested
in contributing to Solr, and this is my first stab at the giving back
part. Being new to this I'm still trying
[
https://issues.apache.org/jira/browse/SOLR-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-386:
-
Attachment: SOLR-386-SolrHighlighter.patch
I'd really like some feedback on this patch. I
[
https://issues.apache.org/jira/browse/SOLR-380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-380:
-
Attachment: SOLR-380-XmlPayload.patch
Functionality is improved. Tests are more complete. I have
[
https://issues.apache.org/jira/browse/SOLR-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-386:
-
Attachment: SOLR-386-SolrHighlighter.patch
SOLR-281 was recently committed. Formerly those changes
[
https://issues.apache.org/jira/browse/SOLR-380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-380:
-
Attachment: lucene-core-2.3-dev.jar
SOLR-380-XmlPayload.patch
This is a draft
[
https://issues.apache.org/jira/browse/SOLR-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-386:
-
Attachment: SOLR-386-SolrHighlighter.patch
Updated patch to work with recent changes made to
[
https://issues.apache.org/jira/browse/SOLR-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-386:
-
Attachment: SOLR-386-SolrHighlighter.patch
This patch allows highlighting to be plugged in.
What
https://issues.apache.org/jira/browse/SOLR-380
> Project: Solr
> Issue Type: New Feature
> Components: search
> Reporter: Tricia Williams
> Priority: Minor
>
> "Paged-Text" FieldType for Solr
> A chance to dig into the guts of Sol
[
https://issues.apache.org/jira/browse/SOLR-380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535748
]
Tricia Williams commented on SOLR-380:
--
The discussion from
http://www.nabble.com/Structured-Lucene-documents
[
https://issues.apache.org/jira/browse/SOLR-380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tricia Williams updated SOLR-380:
-
Description:
"Paged-Text" FieldType for Solr
A chance to dig into the guts of Solr. T
apache.org/jira/browse/SOLR-380
Project: Solr
Issue Type: New Feature
Components: search
Reporter: Tricia Williams
Priority: Minor
"Paged-Text" FieldType for Solr
>
> A chance to dig into the guts of Solr. The problem: If
42 matches