[jira] [Created] (TIKA-2099) Tar files without magic bytes are sporadically detected as text

2016-09-27 Thread Robin Schimpf (JIRA)
Robin Schimpf created TIKA-2099:
---

 Summary: Tar files without magic bytes are sporadically detected 
as text
 Key: TIKA-2099
 URL: https://issues.apache.org/jira/browse/TIKA-2099
 Project: Tika
  Issue Type: Bug
Affects Versions: 1.11
Reporter: Robin Schimpf


When a tar is created with 7-Zip 9.20, the magic bytes "ustar" are not added. 
Everything seems to work fine if the tar contains Microsoft Office files. But 
when it contains only text files, Tika sporadically recognizes it as 
text/plain. The result also seems to depend on the size of the first file in 
the tar, which has to be several KB in size.
The problem was found in version 1.11 and also exists in the latest 
1.14-SNAPSHOT.
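
For illustration only: when the "ustar" magic is missing, one common fallback is to validate the checksum stored in the first 512-byte tar header block. The sketch below is a standalone assumption of how such a check could look, not the code used by Tika or commons-compress; the class and method names are hypothetical.

```java
final class TarHeaderCheck {
    private static final int HEADER_SIZE = 512;
    private static final int CHKSUM_OFFSET = 148; // checksum field position in a tar header
    private static final int CHKSUM_LENGTH = 8;

    /** Returns true if the block's stored checksum matches the computed one. */
    static boolean looksLikeTarHeader(byte[] block) {
        if (block == null || block.length < HEADER_SIZE) {
            return false;
        }
        long stored = parseOctal(block, CHKSUM_OFFSET, CHKSUM_LENGTH);
        if (stored < 0) {
            return false;
        }
        long sum = 0;
        for (int i = 0; i < HEADER_SIZE; i++) {
            boolean inChecksum = i >= CHKSUM_OFFSET && i < CHKSUM_OFFSET + CHKSUM_LENGTH;
            // The checksum field itself is counted as eight ASCII spaces.
            sum += inChecksum ? ' ' : (block[i] & 0xFF);
        }
        return sum == stored;
    }

    /** Parses an octal ASCII field padded with NULs or spaces; returns -1 if invalid. */
    static long parseOctal(byte[] buf, int offset, int length) {
        long value = 0;
        boolean sawDigit = false;
        for (int i = offset; i < offset + length; i++) {
            byte b = buf[i];
            if (b == 0 || b == ' ') {
                if (sawDigit) {
                    break; // trailing terminator
                }
                continue; // leading padding
            }
            if (b < '0' || b > '7') {
                return -1;
            }
            value = (value << 3) + (b - '0');
            sawDigit = true;
        }
        return sawDigit ? value : -1;
    }
}
```

A plain text file will almost never have a valid octal number at offset 148 that also matches the byte sum, so this check can separate a magic-less tar from text/plain regardless of what the first entry contains.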



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TIKA-2099) Tar files without magic bytes are sporadically detected as text

2016-09-29 Thread Robin Schimpf (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15535154#comment-15535154
 ] 

Robin Schimpf commented on TIKA-2099:
-

It seems the ZipContainerDetector never gets called in 
{{tika-parsers/src/test/java/org/apache/tika/mime/TestMimeTypes.java}}. I will 
place the test in 
{{tika-parsers/src/test/java/org/apache/tika/detect/TestContainerAwareDetector.java}}.
I felt safe removing the special handling because of the changes made in 
COMPRESS-331.



[jira] [Commented] (TIKA-2099) Tar files without magic bytes are sporadically detected as text

2016-12-11 Thread Robin Schimpf (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15741257#comment-15741257
 ] 

Robin Schimpf commented on TIKA-2099:
-

Any updates regarding this error?



[jira] [Commented] (TIKA-2446) Tainted Zip file can provoke OOM errors

2018-06-05 Thread Robin Schimpf (JIRA)


[ 
https://issues.apache.org/jira/browse/TIKA-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16502296#comment-16502296
 ] 

Robin Schimpf commented on TIKA-2446:
-

I also see this error from time to time with Tika 1.15. This is a regression 
from the changes made in TIKA-2311. [~talli...@apache.org], since you made the 
changes for the truncated zips, can you provide any help or information? I tried 
to fix the issue but always broke some existing tests.

This might also be forwarded to the POI project. Maybe they can provide a fix 
in their code.

> Tainted Zip file can provoke OOM errors
> ---
>
> Key: TIKA-2446
> URL: https://issues.apache.org/jira/browse/TIKA-2446
> Project: Tika
>  Issue Type: Bug
>Affects Versions: 1.16
>Reporter: Thorsten Schäfer
>Priority: Major
> Attachments: corrupt_zip.zip
>
>
> Hi,
> using Tika 1.16 with embedded POI 3.17-beta1 we experienced an OutOfMemory 
> error on a Zip file. The suspicious code is in the constructor of 
> FakeZipEntry in line 125. Here a ByteArrayOutputStream of up to 2 GiB in size 
> is opened which will most probably lead to an OutOfMemory. The entry size in 
> the zip file can be easily faked by an attacker.
> The code path to FakeZipEntry will be used only if the native 
> java.util.zip.ZipFile implementation already failed to open the (possibly 
> corrupted) Zip. Possibly a more fine grained error analysis could be done in 
> ZipPackage.
> I have attached a tweaked zip file that will provoke this error.
> {code:java}
> public FakeZipEntry(ZipEntry entry, InputStream inp) throws IOException {
>     super(entry.getName());
>
>     // Grab the de-compressed contents for later
>     ByteArrayOutputStream baos;
>     long entrySize = entry.getSize();
>     if (entrySize != -1) {
>         if (entrySize >= Integer.MAX_VALUE) {
>             throw new IOException("ZIP entry size is too large");
>         }
>         baos = new ByteArrayOutputStream((int) entrySize);
>     } else {
>         baos = new ByteArrayOutputStream();
>     }
> {code}
> Kinds,
> Thorsten
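
For illustration only: the allocation concern in the quoted report can be avoided by treating the declared entry size as a capped hint and enforcing a limit on the bytes actually read. This is a minimal sketch under that assumption, not the fix that went into POI or Tika; `BoundedEntryReader`, `MAX_INITIAL_CAPACITY` and the `maxBytes` parameter are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class BoundedEntryReader {
    // Upper bound for the initial buffer, no matter what the entry header claims.
    private static final int MAX_INITIAL_CAPACITY = 64 * 1024;

    /**
     * Reads the stream fully, using declaredSize only as a capped sizing hint
     * and failing once more than maxBytes have actually been read.
     */
    static byte[] read(InputStream in, long declaredSize, long maxBytes) throws IOException {
        int initial = declaredSize < 0
                ? 32
                : (int) Math.min(declaredSize, MAX_INITIAL_CAPACITY);
        ByteArrayOutputStream baos = new ByteArrayOutputStream(initial);
        byte[] chunk = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            total += n;
            if (total > maxBytes) {
                throw new IOException("ZIP entry exceeds limit of " + maxBytes + " bytes");
            }
            baos.write(chunk, 0, n);
        }
        return baos.toByteArray();
    }
}
```

With this shape, a faked entry size close to 2 GiB costs at most 64 KiB up front, and memory grows only with data that is really present in the stream.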



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (TIKA-2446) Tainted Zip file can provoke OOM errors

2018-06-06 Thread Robin Schimpf (JIRA)


[ 
https://issues.apache.org/jira/browse/TIKA-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16504245#comment-16504245
 ] 

Robin Schimpf commented on TIKA-2446:
-

Thank you for the fix!

Looks like POI changed their implementation to use commons-compress recently. 
Good to know they already fixed the problem.

> Tainted Zip file can provoke OOM errors
> ---
>
> Key: TIKA-2446
> URL: https://issues.apache.org/jira/browse/TIKA-2446
> Project: Tika
>  Issue Type: Bug
>Affects Versions: 1.16
>Reporter: Thorsten Schäfer
>Priority: Major
> Fix For: 1.19, 2.0.0
>
> Attachments: corrupt_zip.zip


[jira] [Created] (TIKA-3550) Some DXF files are detected as text/plain

2021-09-12 Thread Robin Schimpf (Jira)
Robin Schimpf created TIKA-3550:
---

 Summary: Some DXF files are detected as text/plain
 Key: TIKA-3550
 URL: https://issues.apache.org/jira/browse/TIKA-3550
 Project: Tika
  Issue Type: Bug
Affects Versions: 2.1.0, 1.27
Reporter: Robin Schimpf


I noticed that Tika fails to detect the file format of the files from 
[https://people.math.sc.edu/Burkardt/data/dxf/dxf.html]

Unlike the included test file (whose test is currently disabled on 2.x), 
those files have 2 spaces before the numbers. The comment in 
tika-mimetypes.xml suggests to me that this should work. It would be nice if 
detection worked with anything from no space to any number of spaces before 
the number.
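
For illustration only: the requested behaviour (any amount of leading whitespace before the group code) can be expressed as a regular expression over the start of the file. This sketch assumes an ASCII DXF begins with group code 0 followed by SECTION, optionally preceded by 999 comment groups; it is not the pattern from tika-mimetypes.xml, and `DxfSniffer` is a hypothetical name.

```java
import java.util.regex.Pattern;

final class DxfSniffer {
    // Optional "999 / comment text" pairs, then group code 0 with any number of
    // leading spaces or tabs, then the SECTION keyword on the following line.
    private static final Pattern DXF_START = Pattern.compile(
            "\\A(?:[ \\t]*999[ \\t]*\\r?\\n[^\\r\\n]*\\r?\\n)*"
            + "[ \\t]*0[ \\t]*\\r?\\n[ \\t]*SECTION\\b");

    /** Returns true if the beginning of the file looks like an ASCII DXF. */
    static boolean looksLikeAsciiDxf(String head) {
        return DXF_START.matcher(head).find();
    }
}
```

Because the pattern is anchored at the start of the input, a file that merely contains the word SECTION somewhere in its text is not matched.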



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (TIKA-3550) Some DXF files are detected as text/plain

2021-09-12 Thread Robin Schimpf (Jira)


 [ 
https://issues.apache.org/jira/browse/TIKA-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robin Schimpf updated TIKA-3550:

Attachment: Cube FreeCAD.dxf

> Some DXF files are detected as text/plain
> -
>
> Key: TIKA-3550
> URL: https://issues.apache.org/jira/browse/TIKA-3550
> Project: Tika
>  Issue Type: Bug
>Affects Versions: 1.27, 2.1.0
>Reporter: Robin Schimpf
>Priority: Major
> Attachments: Cube FreeCAD.dxf
>


[jira] [Commented] (TIKA-3550) Some DXF files are detected as text/plain

2021-09-12 Thread Robin Schimpf (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413764#comment-17413764
 ] 

Robin Schimpf commented on TIKA-3550:
-

I found that FreeCAD is able to export DXF files, so I created a simple cube and 
exported it. This file differs from the linked files in that the software used to 
create the file is the first entry there.



[jira] [Comment Edited] (TIKA-3550) Some DXF files are detected as text/plain

2021-09-12 Thread Robin Schimpf (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413764#comment-17413764
 ] 

Robin Schimpf edited comment on TIKA-3550 at 9/12/21, 7:32 PM:
---

Found that FreeCAD is able to export DXF files. Created a simple cube and 
exported it. This file differs from the linked files as the software used to 
create the file is the first entry there.


was (Author: rschimpf):
Found that FreeCAD is able to export DXF files. Created a simple cube and 
exportet it. This files differs from the linked files as the software used to 
create the file is the first entry there.



[jira] [Commented] (TIKA-3550) Some DXF files are detected as text/plain

2021-09-15 Thread Robin Schimpf (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17415668#comment-17415668
 ] 

Robin Schimpf commented on TIKA-3550:
-

Thank you for fixing the problem so quickly!

> Some DXF files are detected as text/plain
> -
>
> Key: TIKA-3550
> URL: https://issues.apache.org/jira/browse/TIKA-3550
> Project: Tika
>  Issue Type: Bug
>Affects Versions: 1.27, 2.1.0
>Reporter: Robin Schimpf
>Assignee: Tim Allison
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: Cube FreeCAD.dxf
>


[jira] [Created] (TIKA-3859) Wrong filename glob for Zstandard

2022-09-20 Thread Robin Schimpf (Jira)
Robin Schimpf created TIKA-3859:
---

 Summary: Wrong filename glob for Zstandard
 Key: TIKA-3859
 URL: https://issues.apache.org/jira/browse/TIKA-3859
 Project: Tika
  Issue Type: Bug
Affects Versions: 2.4.1
Reporter: Robin Schimpf


I'm currently implementing Zstandard support in my application. When checking 
the tika-mimetypes.xml definition of the format, I noticed that the glob is 
defined as *.zstd, whereas the zstd binary only ever adds *.zst as the file 
extension. Checking the final RFC 8478 revealed that the glob is wrong and 
should be changed to *.zst.
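
For illustration only: RFC 8478 registers "zst" as the file extension and 0xFD2FB528 (stored little-endian) as the frame magic number. The sketch below checks both in plain Java; it is an assumption about how an application might do this on its own, independent of the glob in tika-mimetypes.xml, and `ZstdSniffer` is a hypothetical name.

```java
import java.util.Locale;

final class ZstdSniffer {
    // Magic number 0xFD2FB528 as it appears on disk (little-endian byte order).
    private static final byte[] MAGIC = {(byte) 0x28, (byte) 0xB5, (byte) 0x2F, (byte) 0xFD};

    /** Matches the *.zst glob; *.zstd is deliberately not accepted. */
    static boolean hasZstdExtension(String fileName) {
        return fileName != null
                && fileName.toLowerCase(Locale.ROOT).endsWith(".zst");
    }

    /** Returns true if the buffer starts with the Zstandard frame magic. */
    static boolean hasZstdMagic(byte[] head) {
        if (head == null || head.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (head[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```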



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (TIKA-3859) Wrong filename glob for Zstandard

2022-09-20 Thread Robin Schimpf (Jira)


 [ 
https://issues.apache.org/jira/browse/TIKA-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robin Schimpf updated TIKA-3859:

Description: I'm currently implementing Zstandard support in my 
application. When checking the tika-mimetypes.xml definition of the format I 
noticed that the glob is defined as *.zstd and the zstd binary always adds only 
*.zst as file extension. Checking the final RFC 8478 revealed that the glob is 
wrong and should be changed to *.zst.  (was: I'm currently implementing 
Zstandard support in my application. When checking the tika-mimetypes.xml 
definition of the format I noticed that the glob is defined as *.zstd and the 
zstd binary always adds only *.zst as file extension. Checking the final RFC 
8478 at  revealed that the glob is wrong and should be changed to *.zst.)

> Wrong filename glob for Zstandard
> -
>
> Key: TIKA-3859
> URL: https://issues.apache.org/jira/browse/TIKA-3859
> Project: Tika
>  Issue Type: Bug
>Affects Versions: 2.4.1
>Reporter: Robin Schimpf
>Priority: Major

[jira] [Commented] (TIKA-3859) Wrong filename glob for Zstandard

2022-09-21 Thread Robin Schimpf (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607633#comment-17607633
 ] 

Robin Schimpf commented on TIKA-3859:
-

Thanks for the quick fix!

> Wrong filename glob for Zstandard
> -
>
> Key: TIKA-3859
> URL: https://issues.apache.org/jira/browse/TIKA-3859
> Project: Tika
>  Issue Type: Bug
>Affects Versions: 2.4.1
>Reporter: Robin Schimpf
>Assignee: Tim Allison
>Priority: Major
> Fix For: 2.5.0
>

[jira] [Created] (TIKA-4222) Add detection for OpenSCAD

2024-03-24 Thread Robin Schimpf (Jira)
Robin Schimpf created TIKA-4222:
---

 Summary: Add detection for OpenSCAD
 Key: TIKA-4222
 URL: https://issues.apache.org/jira/browse/TIKA-4222
 Project: Tika
  Issue Type: Improvement
Reporter: Robin Schimpf


OpenSCAD (https://openscad.org/index.html) is a 3D modeller based on a custom 
script language. The files are currently detected as text/plain.

Examples can be found here: 
https://github.com/openscad/openscad/tree/master/examples





[jira] [Created] (TIKA-4223) STL file exported with OpenSCAD not detected correctly

2024-03-24 Thread Robin Schimpf (Jira)
Robin Schimpf created TIKA-4223:
---

 Summary: STL file exported with OpenSCAD not detected correctly
 Key: TIKA-4223
 URL: https://issues.apache.org/jira/browse/TIKA-4223
 Project: Tika
  Issue Type: Bug
Affects Versions: 2.9.1
Reporter: Robin Schimpf
 Attachments: linear_extrude_ascii.stl, linear_extrude_binary.stl

STL files can be in ASCII or in binary format. When this file 
([https://github.com/openscad/openscad/blob/master/examples/Basics/linear_extrude.scad]) 
is exported to STL with OpenSCAD, the resulting ASCII file is detected as text/plain.

Also, the binary STL is detected as application/vnd.ms-pki.stl, which differs 
from the model/stl MIME type that Wikipedia lists for those files.

Commands used for the attached files
{code:java}
openscad.exe --export-format asciistl -o result\linear_extrude_ascii.stl 
examples\Basics\linear_extrude.scad {code}
{code:java}
openscad.exe --export-format binstl -o result\linear_extrude_binary.stl 
examples\Basics\linear_extrude.scad
{code}
Refs:

https://en.wikipedia.org/wiki/STL_(file_format)





[jira] [Created] (TIKA-4224) Add detection for 3MF

2024-03-24 Thread Robin Schimpf (Jira)
Robin Schimpf created TIKA-4224:
---

 Summary: Add detection for 3MF
 Key: TIKA-4224
 URL: https://issues.apache.org/jira/browse/TIKA-4224
 Project: Tika
  Issue Type: Improvement
Reporter: Robin Schimpf
 Attachments: linear_extrude.3mf

3MF is an alternative format to STL for 3D models. When this file 
([https://github.com/openscad/openscad/blob/master/examples/Basics/linear_extrude.scad]) 
is exported to 3MF with OpenSCAD, the resulting file is detected as application/zip.

Export command
{code:java}
openscad.exe -o result\linear_extrude.3mf examples\Basics\linear_extrude.scad 
{code}
Refs:

[https://en.wikipedia.org/wiki/3D_Manufacturing_Format]

[https://3mf.io/]





[jira] [Created] (TIKA-4225) Add detection for AMF

2024-03-24 Thread Robin Schimpf (Jira)
Robin Schimpf created TIKA-4225:
---

 Summary: Add detection for AMF
 Key: TIKA-4225
 URL: https://issues.apache.org/jira/browse/TIKA-4225
 Project: Tika
  Issue Type: Improvement
Reporter: Robin Schimpf
 Attachments: linear_extrude.amf

AMF is an alternative format to STL for 3D models. When this file 
([https://github.com/openscad/openscad/blob/master/examples/Basics/linear_extrude.scad]) 
is exported to AMF with OpenSCAD, the resulting file is detected as application/xml.

Export command
{code:java}
openscad.exe -o result\linear_extrude.amf examples\Basics\linear_extrude.scad 
{code}
Refs:

[https://en.wikipedia.org/wiki/Additive_manufacturing_file_format]
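
For illustration only: since AMF is XML, one way to tell it apart from generic application/xml is to look at the root element. The sketch assumes the root element is named `amf`, as in the files OpenSCAD exports; it uses the JDK's StAX API and is not how Tika implements detection. `AmfSniffer` is a hypothetical name.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class AmfSniffer {
    /** Returns true if the document's root element is named "amf". */
    static boolean hasAmfRoot(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Do not process DTDs or external entities when sniffing untrusted input.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    // The first START_ELEMENT event is the root element.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return "amf".equals(reader.getLocalName());
                    }
                }
                return false;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            return false; // not well-formed XML
        }
    }
}
```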





[jira] [Commented] (TIKA-4222) Add detection for OpenSCAD

2024-03-25 Thread Robin Schimpf (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830605#comment-17830605
 ] 

Robin Schimpf commented on TIKA-4222:
-

Yes, I think the only way to detect it is via the file extension.

> Add detection for OpenSCAD
> --
>
> Key: TIKA-4222
> URL: https://issues.apache.org/jira/browse/TIKA-4222
> Project: Tika
>  Issue Type: Improvement
>Reporter: Robin Schimpf
>Priority: Major
>


[jira] [Commented] (TIKA-4223) STL file exported with OpenSCAD not detected correctly

2024-03-25 Thread Robin Schimpf (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830607#comment-17830607
 ] 

Robin Schimpf commented on TIKA-4223:
-

If I understand the Wikipedia article correctly, the ASCII file has to start with 
"solid". The text that follows is the model name, so that part varies from file to file.

Also, the "OpenSCAD Model" in the binary file seems to be the model name. 
Wikipedia mentions a header of 80 bytes, but there seem to be no magic bytes 
present for detection. So maybe the only way would be the file extension?
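
For illustration only: although binary STL has no magic bytes, its layout (80-byte header, 4-byte little-endian triangle count, 50 bytes per triangle) allows a size consistency check. The sketch below combines that with the "solid" prefix test for ASCII files. It is a heuristic under those assumptions, not Tika's detector, and `StlSniffer` is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class StlSniffer {
    enum Kind { ASCII, BINARY, UNKNOWN }

    /**
     * @param head     the first bytes of the file (at least 84 for the binary check)
     * @param fileSize the total size of the file in bytes
     */
    static Kind classify(byte[] head, long fileSize) {
        // Binary check first: the free-form 80-byte header may itself contain text.
        if (head.length >= 84) {
            long triangles = ByteBuffer.wrap(head, 80, 4)
                    .order(ByteOrder.LITTLE_ENDIAN)
                    .getInt() & 0xFFFFFFFFL; // treat the count as unsigned
            if (fileSize == 84 + 50 * triangles) {
                return Kind.BINARY;
            }
        }
        String text = new String(head, 0, Math.min(head.length, 64), StandardCharsets.US_ASCII);
        if (text.trim().startsWith("solid")) {
            return Kind.ASCII;
        }
        return Kind.UNKNOWN;
    }
}
```

The size check needs the total file length, which a purely prefix-based magic definition cannot express; that may be why the file extension remains the practical fallback.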

> STL file exported with OpenSCAD not detected correctly
> --
>
> Key: TIKA-4223
> URL: https://issues.apache.org/jira/browse/TIKA-4223
> Project: Tika
>  Issue Type: Improvement
>Affects Versions: 2.9.1
>Reporter: Robin Schimpf
>Priority: Major
> Attachments: linear_extrude_ascii.stl, linear_extrude_binary.stl
>


[jira] [Commented] (TIKA-4224) Add detection for 3MF

2024-03-25 Thread Robin Schimpf (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830632#comment-17830632
 ] 

Robin Schimpf commented on TIKA-4224:
-

Reading the spec at 
[https://github.com/3MFConsortium/spec_core/blob/master/3MF%20Core%20Specification.md],
 I find no mention of the [Content_Types].xml file. According to the recommendation at 
[https://github.com/3MFConsortium/spec_core/blob/master/3MF%20Core%20Specification.md#22-part-naming-recommendations],
 the /3D/3dModel.model file, which is an XML file, should be checked.
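
For illustration only: a 3MF file is a ZIP package, so a rough check is to scan the entry names for the OPC content types part and a part ending in ".model". The sketch uses only java.util.zip and matches names loosely; it is an assumption about how such a check could look rather than Tika's container detection, and `ThreeMfSniffer` is a hypothetical name.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class ThreeMfSniffer {
    /** Returns true if the ZIP stream contains the parts expected in a 3MF package. */
    static boolean looksLike3mf(InputStream in) throws IOException {
        boolean hasContentTypes = false;
        boolean hasModelPart = false;
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (name.equals("[Content_Types].xml")) {
                    hasContentTypes = true; // OPC package marker
                } else if (name.toLowerCase(Locale.ROOT).endsWith(".model")) {
                    hasModelPart = true; // e.g. 3D/3dmodel.model
                }
            }
        }
        return hasContentTypes && hasModelPart;
    }
}
```

A stricter check would read the content types part and confirm the media type declared for the model part instead of relying on the file name suffix.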

> Add detection for 3MF
> -
>
> Key: TIKA-4224
> URL: https://issues.apache.org/jira/browse/TIKA-4224
> Project: Tika
>  Issue Type: Improvement
>Reporter: Robin Schimpf
>Priority: Major
> Attachments: linear_extrude.3mf
>


[jira] [Commented] (TIKA-4224) Add detection for 3MF

2024-03-25 Thread Robin Schimpf (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830664#comment-17830664
 ] 

Robin Schimpf commented on TIKA-4224:
-

Ah, OK. I skipped the OPC part. The MIME type is fine with me.



[jira] [Commented] (TIKA-4223) STL file exported with OpenSCAD not detected correctly

2024-03-25 Thread Robin Schimpf (Jira)


[ 
https://issues.apache.org/jira/browse/TIKA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830707#comment-17830707
 ] 

Robin Schimpf commented on TIKA-4223:
-

application/vnd.ms-pki.stl might just be an alias (or older MIME type) for the 
binary STL format. I found this site 
([https://www.westaflex.com/support/dokumente/Dichtung]) where the file is 
listed with that MIME type. After downloading and inspecting it, I can confirm 
it is in the binary STL format.
