Whoops, thank you for moving this to the right mailing list, Oleg!
Mike McCandless
http://blog.mikemccandless.com
On Thu, Mar 6, 2014 at 12:56 AM, Oleg Tikhonov o...@apache.org wrote:
Hi Mike!
Sounds great! Thanks.
Oleg
On Wed, Mar 5, 2014 at 6:47 PM, Michael McCandless
[
https://issues.apache.org/jira/browse/TIKA-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922321#comment-13922321
]
Alexandre Madurell edited comment on TIKA-1252 at 3/6/14 11:10 AM:
On Thu, 6 Mar 2014, Hong-Thai Nguyen wrote:
Guava (https://code.google.com/p/guava-libraries/) provides many
facilities for text, file, collection ... manipulation. Should we use it in
Tika?
Can you give an example of where using Guava would either simplify some
existing code, or improve its
Hong-Thai Nguyen created TIKA-1257:
--
Summary: MS Word Filter out control characters on output
Key: TIKA-1257
URL: https://issues.apache.org/jira/browse/TIKA-1257
Project: Tika
Issue Type:
[
https://issues.apache.org/jira/browse/TIKA-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hong-Thai Nguyen updated TIKA-1257:
---
Attachment: tika-doc-control-char.png
5f01ae23-9e6e-4faa-808a-f78dbb20cc71.doc
[
https://issues.apache.org/jira/browse/TIKA-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hong-Thai Nguyen resolved TIKA-1257.
Resolution: Fixed
Fixed on r1574874
MS Word Filter out control characters on output
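For readers unfamiliar with the problem, here is a minimal sketch of what filtering control characters from extracted text can look like. It is not the change committed in r1574874; the class name ControlCharFilter and the choice to keep tab, line feed and carriage return are assumptions made only for illustration.

```java
// Hypothetical helper, not part of Tika: drops ISO control characters
// from a string while keeping common whitespace controls.
final class ControlCharFilter {
    private ControlCharFilter() {}

    /** Returns the input without ISO control characters, keeping tab, LF and CR. */
    static String strip(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // Assumption: tab, line feed and carriage return are legitimate layout.
            boolean keep = c == '\t' || c == '\n' || c == '\r'
                    || !Character.isISOControl(c);
            if (keep) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The BEL (U+0007) and DC3 (U+0013) characters are removed.
        System.out.print(strip("Hello\u0007 Word\u0013 text\n"));
    }
}
```

Character.isISOControl covers U+0000 to U+001F and U+007F to U+009F, so the whitespace exceptions have to be listed explicitly.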
The Buildbot has detected a new failure on builder tika-trunk while building
ASF Buildbot.
Full details are available at:
http://ci.apache.org/builders/tika-trunk/builds/1169
Buildbot URL: http://ci.apache.org/
Buildslave for this Build: portunus_ubuntu
Build Reason: scheduler
Build Source
Hi,
Can anyone create the branch remotes/origin/1.5 on git?
Thanks
Hong-Thai
-Original Message-
From: David Meikle [mailto:loo...@gmail.com] On Behalf Of David Meikle
Sent: Wednesday, February 19, 2014 23:19
To: annou...@apache.org
Cc: dev@tika.apache.org; u...@tika.apache.org
Subject:
The Buildbot has detected a restored build on builder tika-trunk while building
ASF Buildbot.
Full details are available at:
http://ci.apache.org/builders/tika-trunk/builds/1170
Buildbot URL: http://ci.apache.org/
Buildslave for this Build: portunus_ubuntu
Build Reason: scheduler
Build Source
[
https://issues.apache.org/jira/browse/TIKA-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922490#comment-13922490
]
Hong-Thai Nguyen edited comment on TIKA-1257 at 3/6/14 1:50 PM:
Hi,
On Thu, Mar 6, 2014 at 8:27 AM, Hong-Thai Nguyen
hong-thai.ngu...@polyspot.com wrote:
Can anyone create the branch remotes/origin/1.5 on git?
Do we need a 1.5 branch?
BR,
Jukka Zitting
I guess that users could maintain hotfixes based on a released branch while
waiting for the next release. We already have branches for old releases:
hong-thai.nguyen@HTN-PC /c/git/tika (trunk)
$ git branch -a
* trunk
remotes/origin/0.1-incubating
remotes/origin/0.10
remotes/origin/0.2
Hi,
On Thu, Mar 6, 2014 at 10:14 AM, Hong-Thai Nguyen
hong-thai.ngu...@polyspot.com wrote:
I guess that users could maintain hotfixes based on a released branch while
waiting for the next release.
Right, at least there's no harm in having the branch, so I just
created it in revision 1574919.
BR,
If you do bring it in as a dependency, don't use Guava 15; use Guava 16.
Guava 15 breaks CDI in major app servers (JBoss AS 7, GlassFish 3, WebSphere)
because it ships an incorrect beans.xml.
See https://issues.jboss.org/browse/WELD-1007 and
https://code.google.com/p/guava-libraries/issues/detail?id=1527.
--
Best
Hi,
On Thu, Mar 6, 2014 at 6:54 AM, Nick Burch apa...@gagravarr.org wrote:
On Thu, 6 Mar 2014, Hong-Thai Nguyen wrote:
Guava (https://code.google.com/p/guava-libraries/) provides many
facilities for text, file, collection ... manipulation. Should we use it in Tika?
Can you give an example of
Thanks for the feedback.
There is nothing we can't do with our own code :) Guava just provides
'facilities' that make code clearer, shorter and sometimes faster.
I agree that this integration brings more dependencies and may create conflicts
in end users' applications. Let's leave it as is for now.
Cheers,
Hong-Thai
-Message
[
https://issues.apache.org/jira/browse/TIKA-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hong-Thai Nguyen updated TIKA-1257:
---
Attachment: (was: 5f01ae23-9e6e-4faa-808a-f78dbb20cc71.doc)
MS Word Filter out control
[
https://issues.apache.org/jira/browse/TIKA-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hong-Thai Nguyen updated TIKA-1257:
---
Attachment: testControlCharacters.doc
MS Word Filter out control characters on output
[
https://issues.apache.org/jira/browse/TIKA-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922736#comment-13922736
]
Tim Allison commented on TIKA-1232:
---
Fixed r1574959. Reopen if any tweaks remain to me
[
https://issues.apache.org/jira/browse/TIKA-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1232.
---
Resolution: Fixed
r1574959
Add PDF version to PDFParser output
---
Hong-Thai,
Thank you for running these tests. I suspect (mea culpa) that the increase
in PDF runtime exception failures was caused by PDFBOX-1803/TIKA-1233, which
was not fixed before 1.5 was cut.
I recently made major modifications to the metadata extraction components of
the PDFParser
[
https://issues.apache.org/jira/browse/TIKA-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-1252.
---
Resolution: Fixed
Fixed as of r1574964. Thank you, Alexandre, for raising this issue and for
[
https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922890#comment-13922890
]
Luis Filipe Nassif commented on TIKA-623:
-
Good job. I think a possible improvement
[
https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922921#comment-13922921
]
Tim Allison commented on TIKA-623:
--
Agreed. Is there any way to reuse OutlookParser or to
Hi, folks.
Tika-core is quite pure (uses only java.util.logging) but tika-parsers uses
commons-logging 1.1.1 (through pdfbox), slf4j-api 1.5.6 (through netcdf)
and log4j 1.2.14 (through slf4j-log4j as a test-scope dependency). Also, some
parsers (like pdfbox) just log to stdout/stderr.
It's
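To illustrate what "uses only java.util.logging" means in practice, here is a minimal sketch of adjusting a java.util.logging level from application code. The logger name "org.apache.tika" and the class JulLevelSketch are assumptions for illustration; the message does not say which logger names tika-core uses, and this has no effect on output from commons-logging, slf4j or log4j.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch, not Tika code: sets the level of one
// java.util.logging namespace.
final class JulLevelSketch {
    // Keep a strong reference; the LogManager holds loggers weakly, so a
    // configured logger could otherwise be garbage collected.
    private static final Logger TIKA_LOGGER =
            Logger.getLogger("org.apache.tika"); // assumed logger name

    private JulLevelSketch() {}

    /** Sets the level for the assumed Tika namespace and returns its logger. */
    static Logger setLevel(Level level) {
        TIKA_LOGGER.setLevel(level);
        return TIKA_LOGGER;
    }

    public static void main(String[] args) {
        Logger logger = setLevel(Level.WARNING);
        logger.info("suppressed at WARNING level");
        logger.warning("still reported");
    }
}
```

Each of the other frameworks listed above needs its own configuration or a bridge, which is why a mix of logging libraries is awkward to control from a single place.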
[
https://issues.apache.org/jira/browse/TIKA-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923019#comment-13923019
]
Tim Allison commented on TIKA-1252:
---
[~alexandre.madur...@gmail.com], before opening an
[
https://issues.apache.org/jira/browse/TIKA-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923399#comment-13923399
]
Tim Allison commented on TIKA-1252:
---
Not immediately obvious to me how to use xmpbox with
On Fri, 7 Mar 2014, Konstantin Gribov wrote:
Tika-core is quite pure (uses only java.util.logging) but tika-parsers
uses commons-logging 1.1.1 (through pdfbox), slf4j-api 1.5.6 (through
netcdf) and log4j 1.2.14 (through slf4j-log4j as test scope dependency).
Also some parsers (like pdfbox)