#general


@joshhighley: Ingesting JSON data into a realtime table. A field in the JSON is a JSON string with leading spaces but is always numeric data otherwise: ```{ "account":" 123", .....}``` If my realtime table defines the account column as DOUBLE, then the record loads with no issue -- the spaces appear to be ignored. However, if I define the column as INT then the record does not load. More troublesome, I can't find any error messages in any of the logs -- I would expect some kind of error message?
  @mayanks: Thanks for reporting @joshhighley, let me take a look at the code, and will get back to you
  @mayanks: @joshhighley I did a small experiment, Double.parseDouble can parse " 123", but Integer.parseInt throws NumberFormatException. I suspect that the exception is being swallowed. Either way, seems like a bug.
  @mayanks: For a temporary work-around, is it possible for you to strip the leading spaces? And also file an issue, so we can fix this.
  @joshhighley: No, unfortunately, it's not practical for us to parse the record prior, modify the value, then write it out again.
  @joshhighley: using a Double column type will probably be our workaround
  @joshhighley: I submitted issue #6634
  @mayanks: Thanks for submitting the issue.
  @g.kishore: @mayanks so Java trims the string for double parse but not for integer parse?
  @mayanks: Yes
  @g.kishore: That’s bizarre. Should be a simple fix, but it might add perf overhead for values that are already trimmed. Maybe try trimming only on exception?
  @mayanks: This was a standalone unit test that I did, I'll take a look at where in the code we do the type conversion a little later.
  @mayanks: ```Double.parseDouble(" 123"); // -> 123.0```
  @mayanks: ```Integer.parseInt(" 123"); // -> NumberFormatException```
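The "trim only on exception" idea from this thread can be sketched roughly as follows. This is a minimal illustration, not Pinot's actual type-conversion code; the class and method names (`LenientIntParser`, `parseIntLenient`) are hypothetical.

```java
// Hypothetical helper, not part of Pinot: illustrates retrying with a trimmed
// string only after the first parse attempt fails.
final class LenientIntParser {
    private LenientIntParser() {}

    /** Parses an int, trimming whitespace only if the untrimmed parse fails. */
    static int parseIntLenient(String raw) {
        if (raw == null) {
            throw new NumberFormatException("null");
        }
        try {
            // Fast path: input that is already clean pays no trimming cost.
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            // Slow path: strip leading/trailing whitespace and retry once.
            // If this also fails, the NumberFormatException reaches the caller.
            return Integer.parseInt(raw.trim());
        }
    }

    public static void main(String[] args) {
        // Double.parseDouble tolerates surrounding whitespace.
        System.out.println(Double.parseDouble(" 123"));
        // The lenient helper accepts the same input for ints.
        System.out.println(parseIntLenient(" 123"));
        try {
            Integer.parseInt(" 123");
        } catch (NumberFormatException e) {
            System.out.println("Integer.parseInt rejected the untrimmed value");
        }
    }
}
```

The fast path keeps the common case cheap, which addresses the perf concern raised above; the cost of the exception is only paid for values that actually carry whitespace.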
@joshhighley: BTW, if I remove the leading spaces from the String, it converts successfully to INT. I tried using a data transformation to do this, but a transformation isn't allowed to write to the same column it reads from.
@nachiket.kate: @nachiket.kate has joined the channel
@m.e.driscoll: @m.e.driscoll has joined the channel
@lloyd.branch: @lloyd.branch has joined the channel
@csanderson.data: @csanderson.data has joined the channel
@miliang: @miliang has joined the channel
@joshhighley: When streaming data via Kafka to a realtime table, does it have to be 1 record per message or is there a way to put multiple records in a single message?

#random


@nachiket.kate: @nachiket.kate has joined the channel
@m.e.driscoll: @m.e.driscoll has joined the channel
@lloyd.branch: @lloyd.branch has joined the channel
@csanderson.data: @csanderson.data has joined the channel
@miliang: @miliang has joined the channel

#feat-presto-connector


@dutta.kinshuk: @dutta.kinshuk has joined the channel

#troubleshooting


@nachiket.kate: @nachiket.kate has joined the channel
@elon.azoulay: Hi, we have an issue where the pinot servers are in a crash loop, they cannot start up. The servers are spewing tons of messages like : ```[HelixTaskExecutor] [ZkClient-EventThread-23-pinot-us-central1-zookeeper:2181] SessionId does NOT match. expected sessionId: 300000c69e5009a, tgtSessionId in message: 300000c69e50099, messageId: 9d191304-00cc-4138-bb57-7997a960fab0```
  @elon.azoulay: When I look in the errors section of the zookeeper browser I see: ```"id": "300000c69e50084__enriched_customer_orders_jp_upsert_realtime_streaming_v1_REALTIME", "simpleFields": {}, "mapFields": { "HELIX_ERROR 20210303-100525.000929 STATE_TRANSITION 7f8da719-5667-4d33-adb9-76a8010c9c56": { "AdditionalInfo": "Exception while executing a state transition task enriched_customer_orders_jp_upsert_realtime_streaming_v1__7__330__20210224T2322Zjava.lang.reflect.InvocationTargetException\n\tat jdk.internal.reflect.GeneratedMethodAccessor452.invoke(Unknown Source)\n\tat java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.base/java.lang.reflect.Method.invoke(Method.java:566)\n\tat org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404)\n\tat org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331)\n\tat org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97)\n\tat org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49)\n\tat java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:834)\nCaused by: java.util.NoSuchElementException: 'segment.total.docs' doesn't map to an existing object\n\tat org.apache.commons.configuration.AbstractConfiguration.getInt(AbstractConfiguration.java:816)\n\tat org.apache.pinot.core.segment.index.metadata.SegmentMetadataImpl.<init>(SegmentMetadataImpl.java:128)\n\tat org.apache.pinot.core.segment.index.loader.SegmentPreProcessor.<init>(SegmentPreProcessor.java:71)\n\tat org.apache.pinot.core.indexsegment.immutable.ImmutableSegmentLoader.load(ImmutableSegmentLoader.java:98)\n\tat 
org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.addSegment(RealtimeTableDataManager.java:283)\n\tat org.apache.pinot.server.starter.helix.HelixInstanceDataManager.addRealtimeSegment(HelixInstanceDataManager.java:133)\n\tat org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:164)\n\t... 11 more\n", "Class": "class org.apache.helix.messaging.handling.HelixStateTransitionHandler", "MSG_ID": "8237ad10-da30-4ad9-8b80-930b437d48fa", "Message state": "READ" }, "HELIX_ERROR 20210303-100532.000104 STATE_TRANSITION 24244f32-ca45-463b-9c15-586d5a667669": { "AdditionalInfo": "Message execution failed. msgId: 8237ad10-da30-4ad9-8b80-930b437d48fa, errorMsg: java.lang.reflect.InvocationTargetException", "Class": "class org.apache.helix.messaging.handling.HelixStateTransitionHandler", "MSG_ID": "8237ad10-da30-4ad9-8b80-930b437d48fa", "Message state": "READ" }```
  @jackie.jxt: Based on the error message, it seems the segment `enriched_customer_orders_jp_upsert_realtime_streaming_v1__7__330__20210224T2322Z` is corrupted.
  @jackie.jxt: Does this happen to only one server or all servers?
  @elon.azoulay: only the tenants where it exists.
  @jackie.jxt: If you have time, we can have a quick zoom chat to debug the issue
  @elon.azoulay: wow, I owe you one:) Sure whenever you have some time.
  @jackie.jxt:
@m.e.driscoll: @m.e.driscoll has joined the channel
@lloyd.branch: @lloyd.branch has joined the channel
@csanderson.data: @csanderson.data has joined the channel
@miliang: @miliang has joined the channel
@miliang: Hey, it seems there is a bug in the most recent Pinot code. This kind of query throws an exception:
@miliang: But it previously worked well:
@fx19880617: I think the IN clause should use single quotes
  @miliang: ```SELECT jsonExtractScalar(mapDim2json, '$.non-existing-key', 'INT') FROM FeatureTest1 WHERE bytesDimSV1 = 'deed0507' AND jsonExtractKey(mapDim2json, '$.*') in ('$[non-existing-key]')```
  @miliang: or ```SELECT jsonExtractScalar(mapDim2json, '$.non-existing-key', 'INT') FROM FeatureTest1 WHERE bytesDimSV1 = 'deed0507' AND jsonExtractKey(mapDim2json, '$.*') in ('$[\'non-existing-key\']')```
  @fx19880617: right
  @fx19880617:
  @fx19880617: Pinot uses single quotes for literals and double quotes for identifiers
@fx19880617: The previous version of Pinot didn't check for that, so it would always return empty results
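For anyone building such queries from code, the quoting rule above (single quotes for literals, double quotes for identifiers) could be wrapped in two small helpers. This is only a sketch: `PinotSqlQuoting`, `literal`, and `identifier` are hypothetical names, and escaping an embedded quote by doubling it is the standard SQL convention, assumed here rather than confirmed for Pinot in this thread (the second query above uses backslash escapes instead).

```java
// Hypothetical helpers, not a Pinot API: wrap values and names in the quote
// style described above.
final class PinotSqlQuoting {
    private PinotSqlQuoting() {}

    /** Wraps a string value in single quotes (assumption: '' escapes an embedded '). */
    static String literal(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Wraps a column or table name in double quotes (assumption: "" escapes an embedded "). */
    static String identifier(String name) {
        return "\"" + name.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        // Table, column, and value are taken from the queries in this thread.
        String query = "SELECT " + identifier("bytesDimSV1")
            + " FROM " + identifier("FeatureTest1")
            + " WHERE " + identifier("bytesDimSV1") + " = " + literal("deed0507");
        System.out.println(query);
    }
}
```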

#onboarding


@nachiket.kate: @nachiket.kate has joined the channel

#aggregators


@nachiket.kate: @nachiket.kate has joined the channel

#pinot-dev


@dutta.kinshuk: @dutta.kinshuk has joined the channel

#pinot-docs


@dutta.kinshuk: @dutta.kinshuk has joined the channel

#pinot-perf-tuning


@nachiket.kate: @nachiket.kate has joined the channel

#feat-partial-upsert


@yupeng: @npawar @jackie.jxt @tingchen could you review this doc again and approve it at the top?
@npawar: @npawar has joined the channel
@yupeng: Also, I hope we all agree on the merger interface
@jackie.jxt: The merger interface looks good. How we handle the merge of each column is an implementation detail
@jackie.jxt: The interface should be row based
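To make the "row-based interface, per-column handling as an implementation detail" point concrete, here is one possible shape. Everything in it is hypothetical: `RowMerger`, `PerColumnRowMerger`, and the map-based row representation are illustrative assumptions, not the interface from the design doc or Pinot's code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical row-based contract: callers hand over whole rows and get a whole row back.
interface RowMerger {
    Map<String, Object> merge(Map<String, Object> previous, Map<String, Object> incoming);
}

// One possible implementation: per-column strategies are hidden behind the row-based interface.
final class PerColumnRowMerger implements RowMerger {
    private final Map<String, BinaryOperator<Object>> columnStrategies;

    PerColumnRowMerger(Map<String, BinaryOperator<Object>> columnStrategies) {
        this.columnStrategies = columnStrategies;
    }

    @Override
    public Map<String, Object> merge(Map<String, Object> previous, Map<String, Object> incoming) {
        // Start from the previous row so columns absent from the update are kept.
        Map<String, Object> merged = new HashMap<>(previous);
        for (Map.Entry<String, Object> entry : incoming.entrySet()) {
            String column = entry.getKey();
            Object oldValue = previous.get(column);
            BinaryOperator<Object> strategy = columnStrategies.get(column);
            if (strategy != null && oldValue != null) {
                // A strategy is configured and there is something to merge with.
                merged.put(column, strategy.apply(oldValue, entry.getValue()));
            } else {
                // Default in this sketch: the incoming value overwrites.
                merged.put(column, entry.getValue());
            }
        }
        return merged;
    }
}
```

Callers depend only on `RowMerger`, so the per-column strategies can change without touching the interface.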