join doesn't work

2020-08-06 Thread nt
I'm using the streamline Pulsar connector. Each dataset receives its data properly, but I cannot get the join to work: Dataset datasetPolicyWithWtm = datasetPolicy.withWatermark("__publishTime", "5 minutes").as("pol"); Dataset datasetPhoneWithWtm = datasetPhone.withWatermark("__publishTime", "5

file importing / hibernate

2020-08-05 Thread nt
1. I need to import CSV files with entity resolution logic; Spark could help me process rows in parallel. Do you think this is a good approach? 2. I have a quite complex database structure and am eager to use e.g. Hibernate to resolve and save the data, but it seems like everybody uses plain JDBC; is this