Re: spark-submit parameters about two keytab files to yarn and kafka

2020-11-01 Thread kevin chen
Hi, hope the following method can solve the issue:

*Step 1:* create a Kafka Kerberos (JAAS) config named kafka_client_jaas.conf:

    KafkaClient {
      com.sun.security.auth.module.Krb5LoginModule required
      useKeyTab=true
      keyTab="./kafka.service.keytab"
      storeKey=true
      useTicketCache=false
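(The archive cuts the snippet off above. For context, a complete kafka_client_jaas.conf usually continues with a serviceName and principal entry, and the two keytabs are then shipped with spark-submit roughly as sketched below. The realm, principal names, and paths are placeholder assumptions, not values from the original mail.)

    KafkaClient {
      com.sun.security.auth.module.Krb5LoginModule required
      useKeyTab=true
      keyTab="./kafka.service.keytab"
      storeKey=true
      useTicketCache=false
      serviceName="kafka"
      principal="kafka/host.example.com@EXAMPLE.COM";
    };

    # Sketch only: --principal/--keytab cover the YARN side, while the Kafka
    # keytab and JAAS file are shipped to the driver/executors with --files and
    # picked up via java.security.auth.login.config. Paths, principals, and the
    # application jar are placeholders.
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --principal user@EXAMPLE.COM \
      --keytab /path/to/user.keytab \
      --files kafka_client_jaas.conf,kafka.service.keytab \
      --conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=./kafka_client_jaas.conf" \
      --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=./kafka_client_jaas.conf" \
      your-app.jar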

Re: [Spark Core] Vectorizing very high-dimensional data sourced in long format

2020-11-01 Thread kevin chen
Perhaps adding random numbers to the entity_id column (i.e., salting the key) can avoid the errors (exhausting executor and driver memory) when you solve the issue Patrick's way.

Daniel Chalef wrote on Sat, Oct 31, 2020 at 12:42 AM:
> Yes, the resulting matrix would be sparse. Thanks for the suggestion. Will
> explore ways of
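As a rough illustration of the salting idea (not code from the thread; the column names, paths, and bucket count below are assumptions), aggregating per salted key first and then merging per real entity_id might look like this in Scala:

    // Sketch: salt entity_id before a heavy group-by so no single key has to
    // be aggregated on one executor in a single pass.
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    val spark = SparkSession.builder().appName("salted-vectorize").getOrCreate()
    import spark.implicits._

    val numSalts = 32  // assumed bucket count; tune to the cluster

    // long-format input: one row per (entity_id, feature_id, value) -- assumed schema
    val longDf = spark.read.parquet("/path/to/long_format")  // placeholder path

    // 1) append a random salt and build a salted key
    val salted = longDf
      .withColumn("salt", (rand() * numSalts).cast("int"))
      .withColumn("salted_id",
        concat_ws("_", $"entity_id".cast("string"), $"salt".cast("string")))

    // 2) aggregate per salted key first, producing small partial feature lists
    val partial = salted
      .groupBy($"salted_id", $"entity_id")
      .agg(collect_list(struct($"feature_id", $"value")).as("partial_features"))

    // 3) merge the partials per real entity_id
    val merged = partial
      .groupBy($"entity_id")
      .agg(flatten(collect_list($"partial_features")).as("features"))

    merged.write.mode("overwrite").parquet("/path/to/wide_output")  // placeholder path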