[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Liu updated TIKA-1517:
Description:
Problem and intuition
The original implementation in MIME type determination is a bit less
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Liu updated TIKA-1517:
Description:
Problem and intuition
The original implementation in MIME type determination is a bit less
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Liu updated TIKA-1517:
Description:
Problem and intuition
The original implementation in MIME type determination is a bit less
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Liu updated TIKA-1517:
Attachment: BaysianTest.java
Simple demo program for the MIME type probability detection
MIME type
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Liu updated TIKA-1517:
Summary: MIME type detection with probability (was: MIME type selection
with probability)
MIME type
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Liu updated TIKA-1517:
Description:
Problem and intuition
The original implementation in MIME type determination is a bit less
Shuai Liu created TIKA-1517:
---
Summary: MIME type selection with probability
Key: TIKA-1517
URL: https://issues.apache.org/jira/browse/TIKA-1517
Project: Tika
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Liu updated TIKA-1517:
Description:
Problem and intuition
The original implementation in MIME type determination is a bit less
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Liu updated TIKA-1517:
Description:
Problem and intuition
The original implementation in MIME type determination is a bit less
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Liu updated TIKA-1517:
Comment: was deleted
(was: Proposed design:
The idea of selection is to incorporate probability as weights
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278190#comment-14278190
]
Shuai Liu commented on TIKA-1517:
-
Proposed design:
The idea of selection is to incorporate
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Liu updated TIKA-1517:
Description:
Problem and intuition
The original implementation in MIME type determination is a bit less
[
https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Liu updated TIKA-1517:
Description:
Problem and intuition
The original implementation in MIME type determination is a bit less
[
https://issues.apache.org/jira/browse/TIKA-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161285#comment-14161285
]
Shuai Liu commented on TIKA-1437:
-
Thanks Tim, but I embedded my response below, I hope
[
https://issues.apache.org/jira/browse/TIKA-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161285#comment-14161285
]
Shuai Liu edited comment on TIKA-1437 at 10/7/14 12:44 AM:
---
[
https://issues.apache.org/jira/browse/TIKA-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161285#comment-14161285
]
Shuai Liu edited comment on TIKA-1437 at 10/7/14 12:44 AM:
---
Shuai Liu created TIKA-1437:
---
Summary: encoding issue in AutoDetectReader
Key: TIKA-1437
URL: https://issues.apache.org/jira/browse/TIKA-1437
Project: Tika
Issue Type: Bug
Components:
[
https://issues.apache.org/jira/browse/TIKA-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Liu updated TIKA-1437:
Attachment: computrabajo-ar-20121108.tsv
The TSV file that exhibits the encoding problem.
[
https://issues.apache.org/jira/browse/TIKA-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Liu updated TIKA-1437:
Description:
We are having an encoding problem with Tika AutoDetectReader;
we are using AutoDetectReader to
[
https://issues.apache.org/jira/browse/TIKA-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Liu updated TIKA-1437:
Attachment: ef.jpg
e9.jpg
e9.jpg is a screenshot of the raw TSV file; you can see the
20 matches