pyyuhao commented on issue #10231:
URL: https://github.com/apache/seatunnel/issues/10231#issuecomment-3707883416

   > > > Using the ElasticSearch connector directly will automatically 
recognize OpenSearch
   > > 
   > > 
   > > Thanks a lot, it works fine. 
[@zhangshenghang](https://github.com/zhangshenghang)
   > > However, I ran into another problem when using SeaTunnel to migrate data from 
Elasticsearch to OpenSearch/Elasticsearch.
   > > An integer field in ES/OS can store int, array<int>, array<array<int>>, 
and so on, with a mapping defined like this:
   > > "position_int": { "type": "integer", "doc_values": true, "store": true },
   > > But SeaTunnel only supports array. I hit the error below and could not fix it 
with any method I could think of.
   > > Caused by: java.lang.NumberFormatException: For input string: "[[14,13,13,13,13]]"
   > >  at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
   > >  at java.base/java.lang.Integer.parseInt(Integer.java:652)
   > >  at java.base/java.lang.Integer.parseInt(Integer.java:770)
   > >  at org.apache.seatunnel.connectors.seatunnel.elasticsearch.serialize.source.DefaultSeaTunnelRowDeserializer.convertValue(DefaultSeaTunnelRowDeserializer.java:163)
   > > Could you help me fix this? Is this a bug, or is it a rule that 
can't be changed?
   > 
   > You can refer to the array_column parameter in 
https://seatunnel.apache.org/docs/connector-v2/source/Elasticsearch/ to see if 
it can solve your problem.
   
   @zhangshenghang hi,
   After testing, I can confirm that this is a bug or a mechanism mismatch in 
SeaTunnel. It does not happen when using Logstash to move data from ES to 
ES/OpenSearch, because Logstash handles this feature.
   
   In ES, int / array<int> / array<array<int>> values are all flattened in 
Lucene's storage (a mechanism of dynamic type leniency). The fundamental problem 
is this: ES only requires that each element is an integer, no matter which 
container the element sits in, while SeaTunnel requires the container itself to 
match the declared type: int, array<int>, or array<array<int>>.
   As the following picture shows, this case works fine.
   
   <img width="969" height="376" alt="Image" 
src="https://github.com/user-attachments/assets/753fd6d4-0ff4-403c-86c2-3fea72504592"
 />
   
   However, whether I configure this as array_column<int> or as int, it does not 
work at all.
   
   Does the SeaTunnel Elasticsearch team plan to solve this problem?
   Or would you accept a PR? I think I can add an adapter that handles 
Elasticsearch's special mechanism of dynamic type leniency. (I am a software 
engineer focusing on ES/OpenSearch.)
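
To illustrate the idea, here is a minimal sketch of the normalization such an adapter could apply. It is written in Python only for brevity, and it is not SeaTunnel code: the names `flatten_ints` and `read_integer_field` are hypothetical. It assumes the source value has already been parsed from JSON, so it is either a scalar, a (possibly nested) list, or null.

```python
from typing import Any, Iterator, List


def flatten_ints(value: Any) -> Iterator[int]:
    """Yield every integer leaf of `value`, ignoring how deeply it is nested.

    This mirrors how an ES `integer` field treats 14, [14, 13] and
    [[14, 13]] alike: only the leaf elements matter, not the container.
    """
    if value is None:
        # A null contributes no values to the field.
        return
    if isinstance(value, (list, tuple)):
        for item in value:
            yield from flatten_ints(item)
    else:
        # Scalar leaf: accept ints and numeric strings such as "14".
        yield int(value)


def read_integer_field(value: Any) -> List[int]:
    """Hypothetical adapter entry point: always return a flat list of ints."""
    return list(flatten_ints(value))


if __name__ == "__main__":
    print(read_integer_field(14))                      # scalar
    print(read_integer_field([[14, 13, 13, 13, 13]]))  # value from the error above
```

With this kind of normalization, the value from the stack trace, `[[14,13,13,13,13]]`, would be read as a flat list of five integers instead of being passed whole to an integer parser.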
   
   Looking forward to your reply, thanks


