ong sequences of digits from Excel spreadsheets using Tika
> 1.20 doesn’t yield the expected results
> -
>
> Key: TIKA-3544
> URL: https:/
ika
> 1.20 doesn’t yield the expected results
> -
>
> Key: TIKA-3544
> URL: https://issues.apache.org/jira/browse/TIKA-3544
>
ces of digits from Excel spreadsheets using Tika
> 1.20 doesn’t yield the expected results
> -
>
> Key: TIKA-3544
> URL: https://issues.apac
://getcreditcardnumbers.com/] produces
invalid numbers. In JavaScript, and in most JSON parsers, numbers are
double-precision floating point.
See [https://www.w3schools.com/js/js_numbers.asp]
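A minimal sketch of the precision limit mentioned above, written in Python, whose `float` is also an IEEE 754 double. The helper name `round_trip_through_double` is hypothetical and is not part of Tika or Excel; it only illustrates that not every integer above 2**53 can be stored exactly in a double.

```python
def round_trip_through_double(digits: str) -> str:
    """Convert a digit string to a 64-bit float and back to a digit string."""
    return str(int(float(int(digits))))


# 2**53 + 1 is the smallest positive integer a double cannot represent exactly.
print(round_trip_through_double("9007199254740993"))  # 9007199254740992
# A 16-digit value below 2**53 survives the double round trip unchanged.
print(round_trip_through_double("6480195344642780"))  # 6480195344642780
```

Note that Excel is documented to keep only 15 significant digits, so a spreadsheet can drop digits even in cases where a plain double would not.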
> Extraction of long sequences of digits from Excel spreadsheets using Tika
> 1.20 doesn’t yield the expected r
s from Excel spreadsheets using Tika
> 1.20 doesn’t yield the expected results
> -
>
> Key: TIKA-3544
> URL: https://issues.apache.org/
. Use strings, just as you have to
for US ZIP codes because of the leading '0'.
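To illustrate the leading-zero point, here is a small Python sketch. The ZIP code value and the helper name `as_number_then_text` are illustrative assumptions and do not come from the thread.

```python
def as_number_then_text(value: str) -> str:
    """Simulate storing an identifier as a number and printing it again."""
    return str(int(value))


zip_code = "02134"  # illustrative value with a leading zero
print(as_number_then_text(zip_code))  # 2134, the leading zero is gone
print(zip_code)                       # 02134, kept intact as a string
```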
> Extraction of long sequences of digits from Excel spreadsheets using Tika
> 1.20 doesn’t yield the expected r
be wrong 90% of the time... I'm
now inclined to propose that we not do anything here.
Note: This is Excel for Mac (16.52), your mileage may vary.
> Extraction of long sequences of digits from Excel spreadsheets using Tika
> 1.20 doesn’t yield the expected r
for numbers that might
start with leading zeros, like credit card #s, etc. You have to be really
careful to enter them as strings or, better yet, use an actual database.
> Extraction of long sequences of digits from Excel spreadsheets using Tika
> 1.20 doesn’t yield the expected r
uences of digits from Excel spreadsheets using Tika
> 1.20 doesn’t yield the expected results
> -
>
> Key: TIKA-3544
> URL: https://issues.apache
dit Card Numbers (Source:
http://www.getcreditcardnumbers.com/)
6480195344642780
30295201231669
30082494556063
344850003945824
358338792630
3587385370593640
> Extraction of long sequences of digits from Excel spreadsheets using Tika
> 1.20 doesn’t yield the
.
> Extraction of long sequences of digits from Excel spreadsheets using Tika
> 1.20 doesn’t yield the expected results
> -
>
> Key: TIKA-3544
>
ing Tika
> 1.20 doesn’t yield the expected results
> -
>
> Key: TIKA-3544
> URL: https://issues.apache.org/jira/browse/TIKA-3544
>
cell".
> Extraction of long sequences of digits from Excel spreadsheets using Tika
> 1.20 doesn’t yield the expected results
> -
>
> Key: TIKA-3544
>
elvetica,Regular"12K00P
http://www.getcreditcardnumbers.com/;>http://www.getcreditcardnumbers.com/
{noformat}
> Extraction of long sequences of digits from Excel spreadsheets using Tika
> 1.20 doesn’t yiel
egular"12K00P
http://www.getcreditcardnumbers.com/;>http://www.getcreditcardnumbers.com/
{noformat}
> Extraction of long sequences of digits from Excel spreadsheets using Tika
> 1
ets using Tika
> 1.20 doesn’t yield the expected results
> -
>
> Key: TIKA-3544
> URL: https://issues.apache.org/jira/browse/TIKA-3544
>
ika
> 1.20 doesn’t yield the expected results
> -
>
> Key: TIKA-3544
> URL: https://issues.apache.org/jira/browse/TIKA-3544
>
Jitin Jindal created TIKA-3544:
--
Summary: Extraction of long sequences of digits from Excel
spreadsheets using Tika 1.20 doesn’t yield the expected results
Key: TIKA-3544
URL: https://issues.apache.org/jira/browse
to site shortly and announce release of 1.21.
> Tika 1.20 suffer from 3 separate CVE vulnerabilities
>
>
> Key: TIKA-2877
> URL: https://issues.apache.org/jira/browse/TIKA-2877
>
/2c027535156cc6862149490b289552d72ba5a9bff985fb7cce794e21@%3Cdev.tika.apache.org%3E
I can add a new table for dependency vulnerabilities on our security page.
Thank you.
> Tika 1.20 suffer from 3 separate CVE vulnerabilit
Pat cashman created TIKA-2877:
-
Summary: Tika 1.20 suffer from 3 separate CVE vulnerabilities
Key: TIKA-2877
URL: https://issues.apache.org/jira/browse/TIKA-2877
Project: Tika
Issue Type: Bug
://lists.apache.org/thread.html/36529c7df113e81ace51301175528120884af73b78edd40764a88cf8@%3Cdev.tika.apache.org%3E
> Can't parse pdf in version 1.20 - Pkcs7Parser (DEF length 465542 object
> truncated by
pdf in version 1.20 - Pkcs7Parser (DEF length 465542 object
> truncated by 465479)
>
>
> Key: TIKA-2869
> URL: https://issues.apache.org/jira/browse/TIKA-2869
>
branch.
I'll take a look. Thank you for opening this issue and sharing a triggering
file!
> Can't parse pdf in version 1.20 - Pkcs7Parser (DEF length 465542 object
> truncated by
-app-1.20.jar, it stopped working.
{{java -jar {color:#ff}tika-app-1.20.jar{color} 0001.127_342_5_7955.pdf}}
{{mai 10, 2019 11:36:23 AM org.apache.tika.config.InitializableProblemHandler$3
handleInitializableProblem}}
{{ADVERTÊNCIA: J2KImageReader not loaded. JPEG2000 files
-app-1.20.jar, it stopped working.
{{java -jar {color:#ff}tika-app-1.20.jar{color} 0001.127_342_5_7955.pdf}}
mai 10, 2019 11:36:23 AM org.apache.tika.config.InitializableProblemHandler$3
handleInitializableProblem
ADVERTÊNCIA: J2KImageReader not loaded. JPEG2000 files will not be processed.
See
Edans Sandes created TIKA-2869:
--
Summary: Can't parse pdf in version 1.20 - Pkcs7Parser (DEF length
465542 object truncated by 465479)
Key: TIKA-2869
URL: https://issues.apache.org/jira/browse/TIKA-2869
[
https://issues.apache.org/jira/browse/TIKA-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2855.
---
Resolution: Duplicate
Thank you!
> pdfbox version used by both Apache Tika 1.19.1 and 1
Abhijit Rajwade created TIKA-2855:
-
Summary: pdfbox version used by both Apache Tika 1.19.1 and 1.20
is vulnerable
Key: TIKA-2855
URL: https://issues.apache.org/jira/browse/TIKA-2855
Project: Tika
> >> Hi Tim,
> >>
> >> Thanks for rolling the release.
> >>
> >> Built & validated on Mac OS X 10.12
> >>
> >> Updated flink-crawler, all tests pass.
> >>
> >> So here’s my +1
> >>
> >> — Ken
>
The Apache Tika project is pleased to announce the release of Apache Tika
1.20. The release contents have been pushed out to the main Apache
release site and to the Maven Central sync, so the releases should be
available as soon as the mirrors get the syncs.
Apache Tika is a toolkit for detecting
> >> So here’s my +1
> >>
> >> — Ken
> >>
> >>
> >> > On Dec 17, 2018, at 6:14 PM, Tim Allison wrote:
> >> >
> >> > A candidate for the Tika 1.20 release is available at:
> >> >
> >
> Updated flink-crawler, all tests pass.
>>
>> So here’s my +1
>>
>> — Ken
>>
>>
>> > On Dec 17, 2018, at 6:14 PM, Tim Allison wrote:
>> >
>> > A candidate for the Tika 1.20 release is available at:
>> >
>> > https:/
n Dec 17, 2018, at 6:14 PM, Tim Allison wrote:
> >
> > A candidate for the Tika 1.20 release is available at:
> >
> > https://dist.apache.org/repos/dist/dev/tika/
> >
> > The release candidate is a zip archive of the sources in:
> > h
Hi Tim,
Thanks for rolling the release.
Built & validated on Mac OS X 10.12
Updated flink-crawler, all tests pass.
So here’s my +1
— Ken
> On Dec 17, 2018, at 6:14 PM, Tim Allison wrote:
>
> A candidate for the Tika 1.20 release is available at:
>
> https://dist.ap
we're now suppressing the style markup that our parser
>> > > was (incorrectly, IMHO, inserting) -- check the values in
>> > > "top_10_unique_token_diffs_a", e.g.: rgb: 15 | color: 14 | font: 9 |
>> > > 0,0,0: 4 | background: 4 | 147,147,147: 3 | 247,247,247: 3 |
A candidate for the Tika 1.20 release is available at:
https://dist.apache.org/repos/dist/dev/tika/
The release candidate is a zip archive of the sources in:
https://github.com/apache/tika/tree/1.20-rc1/
The SHA-512 checksum of the archive
>
>> > > I also see that we're losing content in x-java and x-groovy, etc., but
>> > > that's because we're now suppressing the style markup that our parser
>> > > was (incorrectly, IMHO, inserting) -- check the values in
>> > > "top_10_u
ting) -- check the values in
> > > "top_10_unique_token_diffs_a", e.g.: rgb: 15 | color: 14 | font: 9 |
> > > 0,0,0: 4 | background: 4 | 147,147,147: 3 | 247,247,247: 3 | bold: 3 |
> > > weight: 3 | family: 2
> > >
> > > In short, I think we'r
rting) -- check the values in
> > "top_10_unique_token_diffs_a", e.g.: rgb: 15 | color: 14 | font: 9 |
> > 0,0,0: 4 | background: 4 | 147,147,147: 3 | 247,247,247: 3 | bold: 3 |
> > weight: 3 | family: 2
> >
> > In short, I think we're good to go.
0,0,0: 4 | background: 4 | 147,147,147: 3 | 247,247,247: 3 | bold: 3 |
> weight: 3 | family: 2
>
> In short, I think we're good to go. Will roll rc1 later today or
> (more likely) tomorrow unless there are objections.
> On Mon, Dec 10, 2018 at 9:37 PM Tim Allison wrote:
>
Roll forward! Yay!
From: Tim Allison
Reply-To: "dev@tika.apache.org"
Date: Thursday, December 13, 2018 at 7:02 AM
To: "dev@tika.apache.org"
Subject: Re: 1.20?
Reports are here:
http://162.242.228.174/reports/tika_1_20-pre-rc1.zip
I'm going to r
ood to go. Will roll rc1 later today or
(more likely) tomorrow unless there are objections.
On Mon, Dec 10, 2018 at 9:37 PM Tim Allison wrote:
>
> Any blockers on 1.20? I'm going to kick off the regression tests shortly.
> On Fri, Nov 30, 2018 at 7:39 PM wrote:
> >
> >
Any blockers on 1.20? I'm going to kick off the regression tests shortly.
On Fri, Nov 30, 2018 at 7:39 PM wrote:
>
> Hi,
> On Wed, 21 Nov 2018 at 13:00, Tim Allison wrote:
>
> > Dave,
> > Should I try to get the Docker plugin working again?
> >
>
> That wou
Hi,
On Wed, 21 Nov 2018 at 13:00, Tim Allison wrote:
> Dave,
> Should I try to get the Docker plugin working again?
>
That would be great. I think I may have gone down the wrong path building
an image at package time, as there doesn't seem to be an easy way to
publish it as an Apache labelled
+1 would be nice to get the recent ENVI work released as well folks.
On 2018/11/20 23:04:29, Tim Allison wrote:
> All,
> POI 4.0.1 will be out shortly with some important bug fixes. What would
> you all think of targeting 1st/2nd week of December for 1.20?
>
> Cheers,
> Tim
>
ay, November 20, 2018 at 3:04 PM
> To: "dev@tika.apache.org"
> Subject: 1.20?
>
>
>
> All,
>
> POI 4.0.1 will be out shortly with some important bug fixes. What would
>
> you all think of targeting 1st/2nd week of December for 1.20?
>
>
>
> Cheers,
>
> Tim
>
>
>
>
Love it and I can align tika-python with that too ☺
From: Tim Allison
Reply-To: "dev@tika.apache.org"
Date: Tuesday, November 20, 2018 at 3:04 PM
To: "dev@tika.apache.org"
Subject: 1.20?
All,
POI 4.0.1 will be out shortly with some important bug fixes.
All,
POI 4.0.1 will be out shortly with some important bug fixes. What would
you all think of targeting 1st/2nd week of December for 1.20?
Cheers,
Tim