Move definitively from SVN to Git ?

2014-11-17 Thread Hong-Thai Nguyen
Hi all, Git is available everywhere and offers many new features. Should we abandon the SVN repo and move to Git for good, to make it easier to apply fixes and contributions? Thanks, -- Hong-Thai

[jira] [Commented] (TIKA-1476) Allow TesseractOCRParser to be configured using an external configuration file

2014-11-17 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214501#comment-14214501 ] Hudson commented on TIKA-1476: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #317 (See

[jira] [Commented] (TIKA-1447) CHM parser: wrong directory list

2014-11-17 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214535#comment-14214535 ] Hong-Thai Nguyen commented on TIKA-1447: [~binhawking], The work on TIKA-1446 fixed

Re: Move definitively from SVN to Git ?

2014-11-17 Thread Hong-Thai Nguyen
Yes, that's exactly what I'm doing. If we move to Git, we'll avoid all the SVN stuff. Anyway, this concerns committers only. On Mon, Nov 17, 2014 at 12:08 PM, Nick Burch apa...@gagravarr.org wrote: On Mon, 17 Nov 2014, Hong-Thai Nguyen wrote: I didn't realize that we could commit/push directly into git

Re: Move definitively from SVN to Git ?

2014-11-17 Thread Nick Burch
On Mon, 17 Nov 2014, Hong-Thai Nguyen wrote: Yes, that's exactly what I'm doing. If we move to Git, we'll avoid all the SVN stuff. Anyway, this concerns committers only. If we move to git, people who currently use SVN have to change though! Given that non-committers can already work with Git, could you

[jira] [Commented] (TIKA-1446) CHM parser : wrong decompression of aligned blocks

2014-11-17 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214560#comment-14214560 ] Hudson commented on TIKA-1446: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #318 (See

[jira] [Commented] (TIKA-1445) Figure out how to add Image metadata extraction to Tesseract parser

2014-11-17 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214668#comment-14214668 ] Tim Allison commented on TIKA-1445: --- This might muddy results, initially, but users could

Re: svn commit: r1640017 - /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/ocr/TesseractOCRConfig.java

2014-11-17 Thread Mattmann, Chris A (3980)
+1, agreed, Dave would be nice to have one as a default. ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519,

Re: svn commit: r1640017 - /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/ocr/TesseractOCRConfig.java

2014-11-17 Thread Hong-Thai Nguyen
Hi, I've pushed a minor fix to pass this test on Windows. Thanks, On Mon, Nov 17, 2014 at 4:28 PM, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: +1, agreed, Dave would be nice to have one as a default. ++

[jira] [Created] (TIKA-1477) Add customer header to allow overriding of OCR language to be used in Tika Server

2014-11-17 Thread Dave Meikle (JIRA)
Dave Meikle created TIKA-1477: - Summary: Add customer header to allow overriding of OCR language to be used in Tika Server Key: TIKA-1477 URL: https://issues.apache.org/jira/browse/TIKA-1477 Project:

Re: svn commit: r1640017 - /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/ocr/TesseractOCRConfig.java

2014-11-17 Thread David Meikle
On 17 Nov 2014, at 16:32, Hong-Thai Nguyen thaicha...@gmail.com wrote: I've pushed a minor fix to pass this test on Windows. Thanks Hong-Thai, sorry about that! Cheers, Dave

[jira] [Updated] (TIKA-1477) Add custom header to allow overriding of OCR language to be used in Tika Server

2014-11-17 Thread Dave Meikle (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Meikle updated TIKA-1477: -- Summary: Add custom header to allow overriding of OCR language to be used in Tika Server (was: Add

[jira] [Commented] (TIKA-1476) Allow TesseractOCRParser to be configured using an external configuration file

2014-11-17 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214896#comment-14214896 ] Hudson commented on TIKA-1476: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #319 (See

[jira] [Commented] (TIKA-1476) Allow TesseractOCRParser to be configured using an external configuration file

2014-11-17 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214913#comment-14214913 ] Hudson commented on TIKA-1476: -- SUCCESS: Integrated in tika-trunk-jdk1.6 #299 (See

[jira] [Created] (TIKA-1478) Build a parser to extract data from .dif format

2014-11-17 Thread Prasanth Iyer (JIRA)
Prasanth Iyer created TIKA-1478: --- Summary: Build a parser to extract data from .dif format Key: TIKA-1478 URL: https://issues.apache.org/jira/browse/TIKA-1478 Project: Tika Issue Type: New

[jira] [Created] (TIKA-1479) Build a parser to extract data from .iso19139 format

2014-11-17 Thread Prasanth Iyer (JIRA)
Prasanth Iyer created TIKA-1479: --- Summary: Build a parser to extract data from .iso19139 format Key: TIKA-1479 URL: https://issues.apache.org/jira/browse/TIKA-1479 Project: Tika Issue Type:

[jira] [Commented] (TIKA-1445) Figure out how to add Image metadata extraction to Tesseract parser

2014-11-17 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215170#comment-14215170 ] Luis Filipe Nassif commented on TIKA-1445: -- +1 to respect the order of parsers in

[jira] [Commented] (TIKA-1445) Figure out how to add Image metadata extraction to Tesseract parser

2014-11-17 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215292#comment-14215292 ] Nick Burch commented on TIKA-1445: -- +1 to respect the order of parsers in the service

TIKA-1445 and having multiple Parsers (as many as needed) work on the same MediaType

2014-11-17 Thread Mattmann, Chris A (3980)
Hi Guys, There is a great discussion going on around TIKA-1445 right now that I wanted to bring to the dev list: http://issues.apache.org/jira/browse/TIKA-1445 What we are seeing from OCR and GDAL lately is that there may be a use case to have multiple parsers called for the same MediaType. In

[jira] [Commented] (TIKA-1445) Figure out how to add Image metadata extraction to Tesseract parser

2014-11-17 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215303#comment-14215303 ] Chris A. Mattmann commented on TIKA-1445: - Hey [~talli...@apache.org]: Here are my