[jira] [Comment Edited] (CONNECTORS-1617) Date format extraction problem in XLS/XLSX
[ https://issues.apache.org/jira/browse/CONNECTORS-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036105#comment-17036105 ]

Zoltan Farago edited comment on CONNECTORS-1617 at 2/13/20 10:15 AM:
--------------------------------------------------------------------

[~kwri...@metacarta.com] what do you mean by Tika ticket creation? This ticket is related to all Tika components we found in the JIRA. Should I create a new one? Is it possible to move this issue there?

was (Author: zfarago):
[~kwri...@metacarta.com] what do mean under Tika ticket creation? This ticket is related to all Tika components we found in the JIRA. Should I create a new one?

> Date format extraction problem in XLS/XLSX
> ------------------------------------------
>
> Key: CONNECTORS-1617
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1617
> Project: ManifoldCF
> Issue Type: Task
> Components: Tika extractor, Tika service connector
> Affects Versions: ManifoldCF 2.10
> Reporter: Zoltan Farago
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.16
> Attachments: exceldatum.xlsx
>
> Currently Tika/ManifoldCF 2.10 extracts dates from the attached file this way:
> 2018.05.10 -> 10/05/18
> 2002.02.02 -> 2/2/2
> We need this format:
> 2018.05.10 -> 2018-05-10
> 2002.02.02 -> 2002-02-02
> This occurs only when the field type is date. When the field type is text,
> the output is fine.
>
> Please recommend any settings in the pipeline (Tika configs, Excel settings,
> OS locale settings, etc.) that would help, or provide a fix.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CONNECTORS-1617) Date format extraction problem in XLS/XLSX
[ https://issues.apache.org/jira/browse/CONNECTORS-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036105#comment-17036105 ]

Zoltan Farago commented on CONNECTORS-1617:
-------------------------------------------

[~kwri...@metacarta.com] what do you mean by Tika ticket creation? This ticket is related to all Tika components we found in the JIRA. Should I create a new one?

> Date format extraction problem in XLS/XLSX
> ------------------------------------------
>
> Key: CONNECTORS-1617
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1617
> Project: ManifoldCF
> Issue Type: Task
> Components: Tika extractor, Tika service connector
> Affects Versions: ManifoldCF 2.10
> Reporter: Zoltan Farago
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.16
> Attachments: exceldatum.xlsx
>
> Currently Tika/ManifoldCF 2.10 extracts dates from the attached file this way:
> 2018.05.10 -> 10/05/18
> 2002.02.02 -> 2/2/2
> We need this format:
> 2018.05.10 -> 2018-05-10
> 2002.02.02 -> 2002-02-02
> This occurs only when the field type is date. When the field type is text,
> the output is fine.
>
> Please recommend any settings in the pipeline (Tika configs, Excel settings,
> OS locale settings, etc.) that would help, or provide a fix.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CONNECTORS-1617) Date format extraction problem in XLS/XLSX
[ https://issues.apache.org/jira/browse/CONNECTORS-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17031416#comment-17031416 ]

Zoltan Farago commented on CONNECTORS-1617:
-------------------------------------------

[~daddywri] could you please help us and assign this task to an active developer? Thank you in advance.

> Date format extraction problem in XLS/XLSX
> ------------------------------------------
>
> Key: CONNECTORS-1617
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1617
> Project: ManifoldCF
> Issue Type: Task
> Components: Tika extractor, Tika service connector
> Affects Versions: ManifoldCF 2.10
> Reporter: Zoltan Farago
> Priority: Major
> Attachments: exceldatum.xlsx
>
> Currently Tika/ManifoldCF 2.10 extracts dates from the attached file this way:
> 2018.05.10 -> 10/05/18
> 2002.02.02 -> 2/2/2
> We need this format:
> 2018.05.10 -> 2018-05-10
> 2002.02.02 -> 2002-02-02
> This occurs only when the field type is date. When the field type is text,
> the output is fine.
>
> Please recommend any settings in the pipeline (Tika configs, Excel settings,
> OS locale settings, etc.) that would help, or provide a fix.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CONNECTORS-1631) Sharepoint connction problem
[ https://issues.apache.org/jira/browse/CONNECTORS-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016848#comment-17016848 ]

Zoltan Farago commented on CONNECTORS-1631:
-------------------------------------------

For the 2019 version, please notify us when the official release is ready. We currently have a 2016 test environment, but we will be able to install the 2019 version next month. After that, we will be able to supply you with files.

Our potential customer uses SP Online. I will ask Microsoft whether it is possible to install the plugin in our cloud test workspace. Are you interested in their answer? If yes, I will forward it to you.

> Sharepoint connction problem
> ----------------------------
>
> Key: CONNECTORS-1631
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1631
> Project: ManifoldCF
> Issue Type: Task
> Reporter: Zoltan Farago
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.15
> Attachments: Manifold connection.png
>
> Hello,
> We are trying to connect to a SharePoint 2016 site which has a default
> installation. The URL is
> [http://precogwin02/sites/UKAEAtestSP2016/_layouts/15/start.aspx#/Shared%20Documents/Forms/AllItems.aspx]
> and from a browser it is fully operational. The site is installed on our
> local network, so there should be no firewall issues.
> When we try to connect from ManifoldCF we get this error message: "The
> site at
> [http://manifoldsharepoint/sites/UKAEAtestSP2016|http://manifoldsharepoint/sites/UKAEAtestSP2016/Shared%20Documents]
> did not exist or was external; skipping"
> This Manifold installation is able to connect to a Windows share on the same
> server, so we think there are no username/password, Active Directory, or
> similar issues here.
> We checked forums and documentation but found no solution.
>
> Is there any special setting needed in Manifold, SharePoint, etc.?
>
> Thank you in advance!

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CONNECTORS-1631) Sharepoint connction problem
[ https://issues.apache.org/jira/browse/CONNECTORS-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016836#comment-17016836 ]

Zoltan Farago commented on CONNECTORS-1631:
-------------------------------------------

[~kwri...@metacarta.com] thank you! Our developer installed it, and he is able to connect now.

One more question, if you don't mind: will this plugin work with SharePoint 2019 and SharePoint Online as well? If yes, installing it for SP Online would be tricky; is there a detailed step-by-step guide?

> Sharepoint connction problem
> ----------------------------
>
> Key: CONNECTORS-1631
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1631
> Project: ManifoldCF
> Issue Type: Task
> Reporter: Zoltan Farago
> Assignee: Karl Wright
> Priority: Major
> Attachments: Manifold connection.png
>
> Hello,
> We are trying to connect to a SharePoint 2016 site which has a default
> installation. The URL is
> [http://precogwin02/sites/UKAEAtestSP2016/_layouts/15/start.aspx#/Shared%20Documents/Forms/AllItems.aspx]
> and from a browser it is fully operational. The site is installed on our
> local network, so there should be no firewall issues.
> When we try to connect from ManifoldCF we get this error message: "The
> site at
> [http://manifoldsharepoint/sites/UKAEAtestSP2016|http://manifoldsharepoint/sites/UKAEAtestSP2016/Shared%20Documents]
> did not exist or was external; skipping"
> This Manifold installation is able to connect to a Windows share on the same
> server, so we think there are no username/password, Active Directory, or
> similar issues here.
> We checked forums and documentation but found no solution.
>
> Is there any special setting needed in Manifold, SharePoint, etc.?
>
> Thank you in advance!

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CONNECTORS-1631) Sharepoint connction problem
[ https://issues.apache.org/jira/browse/CONNECTORS-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltan Farago updated CONNECTORS-1631:
-------------------------------------
    Attachment: Manifold connection.png

> Sharepoint connction problem
> ----------------------------
>
> Key: CONNECTORS-1631
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1631
> Project: ManifoldCF
> Issue Type: Task
> Reporter: Zoltan Farago
> Priority: Major
> Attachments: Manifold connection.png
>
> Hello,
> We are trying to connect to a SharePoint 2016 site which has a default
> installation. The URL is
> [http://precogwin02/sites/UKAEAtestSP2016/_layouts/15/start.aspx#/Shared%20Documents/Forms/AllItems.aspx]
> and from a browser it is fully operational. The site is installed on our
> local network, so there should be no firewall issues.
> When we try to connect from ManifoldCF we get this error message: "The
> site at
> [http://manifoldsharepoint/sites/UKAEAtestSP2016|http://manifoldsharepoint/sites/UKAEAtestSP2016/Shared%20Documents]
> did not exist or was external; skipping"
> This Manifold installation is able to connect to a Windows share on the same
> server, so we think there are no username/password, Active Directory, or
> similar issues here.
> We checked forums and documentation but found no solution.
>
> Is there any special setting needed in Manifold, SharePoint, etc.?
>
> Thank you in advance!

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CONNECTORS-1631) Sharepoint connction problem
Zoltan Farago created CONNECTORS-1631:
-----------------------------------------

Summary: Sharepoint connction problem
Key: CONNECTORS-1631
URL: https://issues.apache.org/jira/browse/CONNECTORS-1631
Project: ManifoldCF
Issue Type: Task
Reporter: Zoltan Farago
Attachments: Manifold connection.png

Hello,

We are trying to connect to a SharePoint 2016 site which has a default installation. The URL is [http://precogwin02/sites/UKAEAtestSP2016/_layouts/15/start.aspx#/Shared%20Documents/Forms/AllItems.aspx] and from a browser it is fully operational. The site is installed on our local network, so there should be no firewall issues.

When we try to connect from ManifoldCF we get this error message: "The site at [http://manifoldsharepoint/sites/UKAEAtestSP2016|http://manifoldsharepoint/sites/UKAEAtestSP2016/Shared%20Documents] did not exist or was external; skipping"

This Manifold installation is able to connect to a Windows share on the same server, so we think there are no username/password, Active Directory, or similar issues here.

We checked forums and documentation but found no solution.

Is there any special setting needed in Manifold, SharePoint, etc.?

Thank you in advance!

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CONNECTORS-1617) Date format extraction problem in XLS/XLSX
[ https://issues.apache.org/jira/browse/CONNECTORS-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970131#comment-16970131 ]

Zoltan Farago edited comment on CONNECTORS-1617 at 11/8/19 12:39 PM:
---------------------------------------------------------------------

[~ alexlumpov] could you please take a look at this issue? It's been pending for a long time, and we need a solution for it. Thank you!

was (Author: zfarago):
[~ alexlumpov] could you please take a look on this issue? It' been pending fo a lond time, and we need a solution on that. Thank you!

> Date format extraction problem in XLS/XLSX
> ------------------------------------------
>
> Key: CONNECTORS-1617
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1617
> Project: ManifoldCF
> Issue Type: Task
> Components: Tika extractor, Tika service connector
> Affects Versions: ManifoldCF 2.10
> Reporter: Zoltan Farago
> Priority: Major
> Attachments: exceldatum.xlsx
>
> Currently Tika/ManifoldCF 2.10 extracts dates from the attached file this way:
> 2018.05.10 -> 10/05/18
> 2002.02.02 -> 2/2/2
> We need this format:
> 2018.05.10 -> 2018-05-10
> 2002.02.02 -> 2002-02-02
> This occurs only when the field type is date. When the field type is text,
> the output is fine.
>
> Please recommend any settings in the pipeline (Tika configs, Excel settings,
> OS locale settings, etc.) that would help, or provide a fix.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CONNECTORS-1617) Date format extraction problem in XLS/XLSX
[ https://issues.apache.org/jira/browse/CONNECTORS-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970131#comment-16970131 ]

Zoltan Farago commented on CONNECTORS-1617:
-------------------------------------------

[~ alexlumpov] could you please take a look at this issue? It's been pending for a long time, and we need a solution for it. Thank you!

> Date format extraction problem in XLS/XLSX
> ------------------------------------------
>
> Key: CONNECTORS-1617
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1617
> Project: ManifoldCF
> Issue Type: Task
> Components: Tika extractor, Tika service connector
> Affects Versions: ManifoldCF 2.10
> Reporter: Zoltan Farago
> Priority: Major
> Attachments: exceldatum.xlsx
>
> Currently Tika/ManifoldCF 2.10 extracts dates from the attached file this way:
> 2018.05.10 -> 10/05/18
> 2002.02.02 -> 2/2/2
> We need this format:
> 2018.05.10 -> 2018-05-10
> 2002.02.02 -> 2002-02-02
> This occurs only when the field type is date. When the field type is text,
> the output is fine.
>
> Please recommend any settings in the pipeline (Tika configs, Excel settings,
> OS locale settings, etc.) that would help, or provide a fix.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CONNECTORS-1617) Date format extraction problem in XLS/XLSX
Zoltan Farago created CONNECTORS-1617:
-----------------------------------------

Summary: Date format extraction problem in XLS/XLSX
Key: CONNECTORS-1617
URL: https://issues.apache.org/jira/browse/CONNECTORS-1617
Project: ManifoldCF
Issue Type: Task
Components: Tika extractor, Tika service connector
Affects Versions: ManifoldCF 2.10
Reporter: Zoltan Farago
Attachments: exceldatum.xlsx

Currently Tika/ManifoldCF 2.10 extracts dates from the attached file this way:
2018.05.10 -> 10/05/18
2002.02.02 -> 2/2/2

We need this format:
2018.05.10 -> 2018-05-10
2002.02.02 -> 2002-02-02

This occurs only when the field type is date. When the field type is text, the output is fine.

Please recommend any settings in the pipeline (Tika configs, Excel settings, OS locale settings, etc.) that would help, or provide a fix.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
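
Some background that may explain why only date-typed cells are affected: Excel stores a date cell as a serial number plus a display format, so whatever reads the file has to decide how to render it, while a text cell is passed through unchanged. The following is a minimal Python sketch of turning such a serial number into the requested yyyy-MM-dd form. It assumes the workbook uses Excel's default 1900 date system and that the raw numeric cell value is available; `excel_serial_to_iso` is a hypothetical helper for illustration, not part of Tika or ManifoldCF.

```python
from datetime import date, timedelta

# Assumption: the workbook uses Excel's default 1900 date system.
# Taking 1899-12-30 as day zero absorbs Excel's nonexistent 1900-02-29,
# so results are correct for serials from 61 (1900-03-01) onward.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial_to_iso(serial: float) -> str:
    """Convert an Excel date serial number to an ISO 8601 date string."""
    days = int(serial)  # the fractional part encodes the time of day; dropped here
    return (EXCEL_EPOCH + timedelta(days=days)).isoformat()


print(excel_serial_to_iso(43230))  # 2018-05-10
print(excel_serial_to_iso(37289))  # 2002-02-02
```

Working from the serial number avoids guessing the day/month order of an already-formatted string such as 2/2/2.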
[jira] [Commented] (CONNECTORS-1591) RTF comment parsing problem
[ https://issues.apache.org/jira/browse/CONNECTORS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790347#comment-16790347 ]

Zoltan Farago commented on CONNECTORS-1591:
-------------------------------------------

[~kwri...@metacarta.com], thank you! Please link the new ticket here, or add me to the watchers list.

> RTF comment parsing problem
> ---------------------------
>
> Key: CONNECTORS-1591
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1591
> Project: ManifoldCF
> Issue Type: Bug
> Reporter: Zoltan Farago
> Assignee: Karl Wright
> Priority: Major
> Attachments: comment.rtf, result.txt
>
> We have a problem with Manifold/Tika. When a comment is parsed from an RTF
> file, the result has no separator. See attachments.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (CONNECTORS-1591) RTF comment parsing problem
[ https://issues.apache.org/jira/browse/CONNECTORS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790321#comment-16790321 ]

Zoltan Farago edited comment on CONNECTORS-1591 at 3/12/19 7:40 AM:
--------------------------------------------------------------------

[~kwri...@metacarta.com] The Manifold version is 2.10 and we do not use the mapper attachment. We tried Tika 1.17 and 1.19; both have the same problem.

was (Author: zfarago):
[~kwri...@metacarta.com] Manifold version is 2.10 an we do not use the mapper attachment. We tried TIka 1.17 and 1.19 both has the same problem.

> RTF comment parsing problem
> ---------------------------
>
> Key: CONNECTORS-1591
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1591
> Project: ManifoldCF
> Issue Type: Bug
> Reporter: Zoltan Farago
> Priority: Major
> Attachments: comment.rtf, result.txt
>
> We have a problem with Manifold/Tika. When a comment is parsed from an RTF
> file, the result has no separator. See attachments.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1591) RTF comment parsing problem
[ https://issues.apache.org/jira/browse/CONNECTORS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790321#comment-16790321 ]

Zoltan Farago commented on CONNECTORS-1591:
-------------------------------------------

[~kwri...@metacarta.com] The Manifold version is 2.10 and we do not use the mapper attachment. We tried Tika 1.17 and 1.19; both have the same problem.

> RTF comment parsing problem
> ---------------------------
>
> Key: CONNECTORS-1591
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1591
> Project: ManifoldCF
> Issue Type: Bug
> Reporter: Zoltan Farago
> Priority: Major
> Attachments: comment.rtf, result.txt
>
> We have a problem with Manifold/Tika. When a comment is parsed from an RTF
> file, the result has no separator. See attachments.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (CONNECTORS-1591) RTF comment parsing problem
[ https://issues.apache.org/jira/browse/CONNECTORS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790287#comment-16790287 ]

Zoltan Farago edited comment on CONNECTORS-1591 at 3/12/19 6:58 AM:
--------------------------------------------------------------------

[~kwri...@metacarta.com] The output is an Elastic index. Comments in all other file types (.doc, .xls, .pdf, .docx, .odt, etc.) are separated from the content text by a space. In RTF files the space is missing.

was (Author: zfarago):
the output is an Elastic index. Comments in all other filetypes (.doc, .xls, .pdf, .dcx, .odt, etc) are separated with space from the content text.

> RTF comment parsing problem
> ---------------------------
>
> Key: CONNECTORS-1591
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1591
> Project: ManifoldCF
> Issue Type: Bug
> Reporter: Zoltan Farago
> Priority: Major
> Attachments: comment.rtf, result.txt
>
> We have a problem with Manifold/Tika. When a comment is parsed from an RTF
> file, the result has no separator. See attachments.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1591) RTF comment parsing problem
[ https://issues.apache.org/jira/browse/CONNECTORS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790287#comment-16790287 ]

Zoltan Farago commented on CONNECTORS-1591:
-------------------------------------------

The output is an Elastic index. Comments in all other file types (.doc, .xls, .pdf, .docx, .odt, etc.) are separated from the content text by a space.

> RTF comment parsing problem
> ---------------------------
>
> Key: CONNECTORS-1591
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1591
> Project: ManifoldCF
> Issue Type: Bug
> Reporter: Zoltan Farago
> Priority: Major
> Attachments: comment.rtf, result.txt
>
> We have a problem with Manifold/Tika. When a comment is parsed from an RTF
> file, the result has no separator. See attachments.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1591) RTF comment parsing problem
[ https://issues.apache.org/jira/browse/CONNECTORS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789704#comment-16789704 ]

Zoltan Farago commented on CONNECTORS-1591:
-------------------------------------------

Basically, we processed comment.rtf with Manifold using the Tika content connector, and the result is result.txt. This is the content of the RTF file.

> RTF comment parsing problem
> ---------------------------
>
> Key: CONNECTORS-1591
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1591
> Project: ManifoldCF
> Issue Type: Bug
> Reporter: Zoltan Farago
> Priority: Major
> Attachments: comment.rtf, result.txt
>
> We have a problem with Manifold/Tika. When a comment is parsed from an RTF
> file, the result has no separator. See attachments.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CONNECTORS-1591) RTF comment parsing problem
[ https://issues.apache.org/jira/browse/CONNECTORS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltan Farago updated CONNECTORS-1591:
-------------------------------------
    Attachment: (was: result.xml)

> RTF comment parsing problem
> ---------------------------
>
> Key: CONNECTORS-1591
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1591
> Project: ManifoldCF
> Issue Type: Bug
> Reporter: Zoltan Farago
> Priority: Major
> Attachments: comment.rtf, result.txt
>
> We have a problem with Manifold/Tika. When a comment is parsed from an RTF
> file, the result has no separator. See attachments.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1591) RTF comment parsing problem
[ https://issues.apache.org/jira/browse/CONNECTORS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789671#comment-16789671 ]

Zoltan Farago commented on CONNECTORS-1591:
-------------------------------------------

[~kwri...@metacarta.com] you are right, it was a slack issue between a developer and me. I have now attached it as a .txt file. Thank you.

> RTF comment parsing problem
> ---------------------------
>
> Key: CONNECTORS-1591
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1591
> Project: ManifoldCF
> Issue Type: Bug
> Reporter: Zoltan Farago
> Priority: Major
> Attachments: comment.rtf, result.txt
>
> We have a problem with Manifold/Tika. When a comment is parsed from an RTF
> file, the result has no separator. See attachments.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CONNECTORS-1591) RTF comment parsing problem
[ https://issues.apache.org/jira/browse/CONNECTORS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltan Farago updated CONNECTORS-1591:
-------------------------------------
    Attachment: result.txt

> RTF comment parsing problem
> ---------------------------
>
> Key: CONNECTORS-1591
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1591
> Project: ManifoldCF
> Issue Type: Bug
> Reporter: Zoltan Farago
> Priority: Major
> Attachments: comment.rtf, result.txt
>
> We have a problem with Manifold/Tika. When a comment is parsed from an RTF
> file, the result has no separator. See attachments.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CONNECTORS-1591) RTF comment parsing problem
Zoltan Farago created CONNECTORS-1591:
-----------------------------------------

Summary: RTF comment parsing problem
Key: CONNECTORS-1591
URL: https://issues.apache.org/jira/browse/CONNECTORS-1591
Project: ManifoldCF
Issue Type: Bug
Reporter: Zoltan Farago
Attachments: comment.rtf, result.xml

We have a problem with Manifold/Tika. When a comment is parsed from an RTF file, the result has no separator. See attachments.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)