Re: spark structured streaming GroupState returns weird values from sate

2020-03-31 Thread Jungtaek Lim
That seems to come from the difference how Spark infers schema and create serializer / deserializer for Java beans to construct bean encoder. When inferring schema for Java beans, all properties which have getter methods are considered. When creating serializer / deserializer, only properties whic

Re: spark structured streaming GroupState returns weird values from sate

2020-03-31 Thread Srinivas V
Never mind. It got resolved after I removed extra two getter methods (to calculate duration) I created in my State specific Java bean (ProductSessionInformation). But I am surprised why it has created so much problem. I guess when this bean is converted to Scala class it may not be taking care of n

Re: spark structured streaming GroupState returns weird values from sate

2020-03-28 Thread Srinivas V
Ok, I will try to create some simple code to reproduce, if I can. Problem is that I am adding this code in an existing big project with several dependencies with spark streaming older version(2.2) on root level etc. Also, I observed that there is @Experimental on GroupState class. What state is it

Re: spark structured streaming GroupState returns weird values from sate

2020-03-28 Thread Jungtaek Lim
I have't heard known issue for this - that said, this may require new investigation which is not possible or require huge effort without simple reproducer. Contributors (who are basically volunteers) may not want to struggle to reproduce from your partial information - I'd recommend you to spend y

Re: spark structured streaming GroupState returns weird values from sate

2020-03-28 Thread Srinivas V
Sorry for typos , correcting them below On Sat, Mar 28, 2020 at 4:39 PM Srinivas V wrote: > Sorry I was just changing some names not to send exact names. Please > ignore that. I am really struggling with this since couple of days. Can > this happen due to > 1. some of the values being null or >

Re: spark structured streaming GroupState returns weird values from sate

2020-03-28 Thread Srinivas V
Sorry I was just changing some names not to send exact names. Please ignore that. I am really struggling with this sine couple of days. Can this happen due to 1. some of the values being null or 2.UTF8 issue ? Or some sterilization/ deserilization issue ? 3. Not enough memory ? I am using same nam

Re: spark structured streaming GroupState returns weird values from sate

2020-03-27 Thread Jungtaek Lim
Well, the code itself doesn't seem to be OK - you're using ProductStateInformation as the class of State whereas you provide ProductSessionInformation to Encoder for State. On Fri, Mar 27, 2020 at 11:14 PM Jungtaek Lim wrote: > Could you play with Encoders.bean()? You can Encoders.bean() with yo

Re: spark structured streaming GroupState returns weird values from sate

2020-03-27 Thread Jungtaek Lim
Could you play with Encoders.bean()? You can Encoders.bean() with your class, and call .schema() with the return value to see how it transforms to the schema in Spark SQL. The schema must be consistent across multiple JVM runs to make it work properly, but I suspect it doesn't retain the order. On

spark structured streaming GroupState returns weird values from sate

2020-03-27 Thread Srinivas V
I am listening to Kafka topic with a structured streaming application with Java, testing it on my local Mac. When I retrieve back GroupState object with state.get(), it is giving some random values for the fields in the object, some are interchanging some are default and some are junk values. See