pyyuhao commented on issue #10231: URL: https://github.com/apache/seatunnel/issues/10231#issuecomment-3707883416
> > > Using the ElasticSearch connector directly will automatically recognize OpenSearch
> >
> > Thanks a lot, it works fine. [@zhangshenghang](https://github.com/zhangshenghang)
> >
> > However, I ran into another problem when using SeaTunnel to migrate data from Elasticsearch to OpenSearch/Elasticsearch.
> >
> > An ES/OS integer field can store int, array<int>, array<array<int>>, ... when defined like this:
> >
> > "position_int": { "type": "integer", "doc_values": true, "store": true },
> >
> > But SeaTunnel only supports array. I hit this error, which I could not fix with any method I could think of.
> >
> > Caused by: java.lang.NumberFormatException: For input string: "[[14,13,13,13,13]]"
> > at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> > at java.base/java.lang.Integer.parseInt(Integer.java:652)
> > at java.base/java.lang.Integer.parseInt(Integer.java:770)
> > at org.apache.seatunnel.connectors.seatunnel.elasticsearch.serialize.source.DefaultSeaTunnelRowDeserializer.convertValue(DefaultSeaTunnelRowDeserializer.java:163)
> >
> > Could you help me fix this? Is this a bug, or is it a rule that can't be changed?
>
> You can refer to the array_column parameter in https://seatunnel.apache.org/docs/connector-v2/source/Elasticsearch/ to see if it can solve your problem.

@zhangshenghang Hi, after testing I can confirm that this is a bug, or at least a mechanism mismatch, in SeaTunnel. It does not happen when using Logstash to move data from ES to ES/OpenSearch, because Logstash accommodates this behavior. In ES, int / array<int> / array<array<int>> values are all flattened in Lucene's storage (a mechanism of dynamic type leniency).

The fundamental problem is that ES accepts any element that is an integer, regardless of which container it sits in, whereas SeaTunnel matches on the declared container type: int / array<int> / array<array<int>>.

As the following picture shows, this case works fine.
<img width="969" height="376" alt="Image" src="https://github.com/user-attachments/assets/753fd6d4-0ff4-403c-86c2-3fea72504592" />

However, whether I configure this field as array_column<int> or as int, it does not work at all.

Does the SeaTunnel Elasticsearch team plan to solve this problem? Or would you accept a PR? I think I can add an adapter that handles Elasticsearch's special mechanism of dynamic type leniency. (I am a software engineer focusing on ES/OpenSearch.)

Looking forward to your reply, thanks.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
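
To make the flattening idea from the comment concrete, here is a minimal sketch of what such an adapter could do: instead of handing the raw value `"[[14,13,13,13,13]]"` to `Integer.parseInt`, it collects every integer regardless of nesting depth. `IntFieldFlattener` and `flatten` are hypothetical names used for illustration only; this is not SeaTunnel's API, and a real fix inside `DefaultSeaTunnelRowDeserializer` would probably operate on the parsed JSON node rather than on a string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of SeaTunnel: treats a value of an ES
// "integer" field as a flat list of ints, however deeply it is nested.
class IntFieldFlattener {

    /** Accepts a scalar ("14") or a nested array literal ("[[14,13]]"). */
    static List<Integer> flatten(String raw) {
        List<Integer> out = new ArrayList<>();
        StringBuilder token = new StringBuilder();
        int depth = 0;
        for (char c : raw.toCharArray()) {
            if (c == '[' || c == ']' || c == ',' || Character.isWhitespace(c)) {
                // A separator ends the number currently being read, if any.
                if (token.length() > 0) {
                    out.add(Integer.parseInt(token.toString()));
                    token.setLength(0);
                }
                if (c == '[') depth++;
                if (c == ']') depth--;
                if (depth < 0) {
                    throw new IllegalArgumentException("Unbalanced brackets: " + raw);
                }
            } else {
                token.append(c);
            }
        }
        if (token.length() > 0) {
            out.add(Integer.parseInt(token.toString()));
        }
        if (depth != 0) {
            throw new IllegalArgumentException("Unbalanced brackets: " + raw);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(flatten("14"));                 // prints [14]
        System.out.println(flatten("[[14,13,13,13,13]]")); // prints [14, 13, 13, 13, 13]
    }
}
```

Non-integer tokens still fail with `NumberFormatException`, so the sketch only relaxes the container shape, not the element type, which mirrors how the comment describes ES treating integer fields.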
