python processor not updating write stats

2024-08-28 Thread chris snow
My processor is successfully writing to my destination, but the write stats are not updated. [image: image.png] I think the relevant code in my transform method is: failed_batches = self.write_to_vastdb(context, session, pa_table) if len(failed_batches) == 0: return FlowFileTransformResult(rela

Re: how to retrieve a record reader from a controller service using python processor API?

2024-08-25 Thread chris snow
https://github.com/apache/nifi/blob/main/nifi-docs/src/main/asciidoc/python-developer-guide.adoc#recordtransform > > > On Aug 24, 2024, at 11:13 AM, chris snow wrote: > > It seems I would need access to the ProcessSession but from what I can > understand from Java processors tha

Re: how to retrieve a record reader from a controller service using python processor API?

2024-08-24 Thread chris snow
It seems I would need access to the ProcessSession but from what I can understand from Java processors that is passed in via the on_trigger method which doesn't appear to have been implemented for Python processors? On Sat, 24 Aug 2024 at 08:22, chris snow wrote: > I have a python c

how to retrieve a record reader from a controller service using python processor API?

2024-08-24 Thread chris snow
I have a python component that uses a controller service. I can't figure out from the java api docs [1] and python api source code [2] how to retrieve the record reader. Any suggestions? from nifiapi.properties import PropertyDescriptor, PythonPropertyValue from nifiapi.flowfiletransform import

Re: 2.0.0-M4 Python Processor - issue creating PropertyDescriptor dependency

2024-08-23 Thread chris snow
I found some examples here: https://github.com/apache/nifi-python-extensions/blob/main/src/extensions/chunking/ParseDocument.py On Thu, 22 Aug 2024 at 08:22, chris snow wrote: > I have the following minimal example where I'm trying to create a > dependent PropertyDescriptor >

[jira] [Updated] (NIFI-13673) Create a NiFi Python Processor Example that has PropertyDefinition dependent on another PropertyDefinition

2024-08-23 Thread chris snow (Jira)
[ https://issues.apache.org/jira/browse/NIFI-13673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chris snow updated NIFI-13673: -- Summary: Create a NiFi Python Processor Example that has PropertyDefinition dependent on another

[jira] [Updated] (NIFI-13673) CLONE - Create a NiFi Python Processor Example that has PropertyDefinition dependent on another PropertyDefinition

2024-08-22 Thread chris snow (Jira)
[ https://issues.apache.org/jira/browse/NIFI-13673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chris snow updated NIFI-13673: -- Description: I'm trying to create a  PropertyDefinition that is dependent on an

[jira] [Updated] (NIFI-13672) Create a NiFi Python Processor Example that has a Reader Service PropertyDefinition

2024-08-22 Thread chris snow (Jira)
[ https://issues.apache.org/jira/browse/NIFI-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chris snow updated NIFI-13672: -- Description: I'm trying to create a Reader Service PropertyDefinition, but it isn't clear

[jira] [Updated] (NIFI-13673) CLONE - Create a NiFi Python Processor Example that has PropertyDefinition dependent on another PropertyDefinition

2024-08-22 Thread chris snow (Jira)
[ https://issues.apache.org/jira/browse/NIFI-13673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chris snow updated NIFI-13673: -- Description: I'm trying to create a  PropertyDefinition that is dependent on an

[jira] [Updated] (NIFI-13672) Create a NiFi Python Processor Example that has a Reader Service PropertyDefinition

2024-08-22 Thread chris snow (Jira)
[ https://issues.apache.org/jira/browse/NIFI-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chris snow updated NIFI-13672: -- Description: I'm trying to create a Reader Service PropertyDefinition, but it isn't clear

[jira] [Created] (NIFI-13673) CLONE - Create a NiFi Python Processor Example that has PropertyDefinition dependent on another PropertyDefinition

2024-08-22 Thread chris snow (Jira)
chris snow created NIFI-13673: - Summary: CLONE - Create a NiFi Python Processor Example that has PropertyDefinition dependent on another PropertyDefinition Key: NIFI-13673 URL: https://issues.apache.org/jira/browse

[jira] [Created] (NIFI-13672) Create a NiFi Python Processor Example that has a Reader Service PropertyDefinition

2024-08-22 Thread chris snow (Jira)
chris snow created NIFI-13672: - Summary: Create a NiFi Python Processor Example that has a Reader Service PropertyDefinition Key: NIFI-13672 URL: https://issues.apache.org/jira/browse/NIFI-13672 Project

2.0.0-M4 Python Processor - issue creating PropertyDescriptor dependency

2024-08-22 Thread chris snow
I have the following minimal example where I'm trying to create a dependent PropertyDescriptor # from nifiapi.properties import PropertyDescriptor, StandardValidators from nifiapi.flowfiletransform import FlowFileTransform, FlowFileTransformResult class MinimalPythonExample(FlowFileTransform)

[sage-edu] linear algebra with SageMath course notebooks

2024-05-24 Thread chris snow
I've created an opensource course on Linear Algebra. You can find more information here: https://nbviewer.org/github/snowch/learn_linear_algebra/blob/main/notebooks/00-start-here.ipynb Hope this is useful. -- You received this message because you are subscribed to the Google Groups "sage-edu

[jira] [Comment Edited] (SPARK-24668) PySpark crashes when getting the webui url if the webui is disabled

2022-01-28 Thread chris snow (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483735#comment-17483735 ] chris snow edited comment on SPARK-24668 at 1/28/22, 12:0

[jira] [Commented] (SPARK-24668) PySpark crashes when getting the webui url if the webui is disabled

2022-01-28 Thread chris snow (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483735#comment-17483735 ] chris snow commented on SPARK-24668: I tried `getOrElse(None)` ... {code:java}

[jira] [Commented] (TOREE-234) Suggest expanding magics by Including ! shell commands attached

2020-09-01 Thread chris snow (Jira)
[ https://issues.apache.org/jira/browse/TOREE-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188239#comment-17188239 ] chris snow commented on TOREE-234: -- I recently used Toree with pyspark and foun

[jira] [Created] (PHOENIX-4959) Please document the different driver types (fat and thin)

2018-10-07 Thread chris snow (JIRA)
chris snow created PHOENIX-4959: --- Summary: Please document the different driver types (fat and thin) Key: PHOENIX-4959 URL: https://issues.apache.org/jira/browse/PHOENIX-4959 Project: Phoenix

Batch insert using Python API executemany()

2018-10-07 Thread chris snow
I'm investigating the executemany() api call to insert batches of records consumed from Kafka. Does this API call operate on the batch as a whole, or does it call execute() for each insert? The source code seems like it does the latter, but I'm not 100% sure. If executemany() operates on the bat
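The question above turns on how executemany() behaves. As a minimal sketch only: the block below uses Python's built-in sqlite3 module as a stand-in, because it follows the same DB-API 2.0 interface and runs without a cluster. It is not the Phoenix driver from the thread, and the table name, column names, and batch contents are hypothetical.

```python
import sqlite3

# Stand-in database: sqlite3 is used only because it ships with Python and
# follows DB-API 2.0. It is NOT the Phoenix driver discussed in the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

# Hypothetical batch, e.g. records consumed from a Kafka topic.
batch = [(1, "a"), (2, "b"), (3, "c")]

# A single executemany() call is given the whole sequence of parameter tuples.
conn.executemany("INSERT INTO events (id, payload) VALUES (?, ?)", batch)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(row_count)
```

DB-API 2.0 (PEP 249) only fixes the interface: a driver is free to implement executemany() with repeated execute() calls or with a true batch operation, so the answer for any particular driver has to come from its source or documentation.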

Does the Spark Plugin support Phoenix as a Spark Structured Streaming sink?

2018-10-07 Thread chris snow
Does the Spark Plugin (https://phoenix.apache.org/phoenix_spark.html) support Phoenix as a Spark Structured Streaming sink? If yes, can anyone share an example of how to do this?

Compatibility matrix delivery guarantees

2018-06-02 Thread chris snow
When is it expected that the compatibility matrix will list delivery guarantees? Is there a jira I can subscribe to for monitoring this? Many thanks!

[jira] [Commented] (FLINK-8497) KafkaConsumer throws NPE if topic doesn't exist

2018-04-25 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451826#comment-16451826 ] chris snow commented on FLINK-8497: --- Sorry for the late response. Yes, please

[jira] [Commented] (FLINK-8939) Provide better support for saving streaming data to s3

2018-03-25 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16413404#comment-16413404 ] chris snow commented on FLINK-8939: --- Ah, that makes sense if this is relate

[jira] [Commented] (FLINK-8939) Provide better support for saving streaming data to s3

2018-03-25 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16413170#comment-16413170 ] chris snow commented on FLINK-8939: --- I can’t recall seeing that issue @yanxia

[jira] [Created] (FLINK-8939) Provide better support for saving streaming data to s3

2018-03-14 Thread chris snow (JIRA)
chris snow created FLINK-8939: - Summary: Provide better support for saving streaming data to s3 Key: FLINK-8939 URL: https://issues.apache.org/jira/browse/FLINK-8939 Project: Flink Issue Type

[jira] [Comment Edited] (FLINK-3588) Add a streaming (exactly-once) JDBC connector

2018-03-03 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384724#comment-16384724 ] chris snow edited comment on FLINK-3588 at 3/3/18 4:3

[jira] [Commented] (FLINK-3588) Add a streaming (exactly-once) JDBC connector

2018-03-03 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384724#comment-16384724 ] chris snow commented on FLINK-3588: --- This appears to have been implemented? {code:

[jira] [Commented] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-27 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378211#comment-16378211 ] chris snow commented on FLINK-8543: --- Thanks for fixing this [~aljoscha]! {quote

[jira] [Commented] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-22 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372534#comment-16372534 ] chris snow commented on FLINK-8543: ---   [~aljoscha] - what are your thoughts on

[jira] [Commented] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-21 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371882#comment-16371882 ] chris snow commented on FLINK-8543: --- So all files being accompanied by .valid-lengt

[jira] [Commented] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-21 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371855#comment-16371855 ] chris snow commented on FLINK-8543: --- Does Flink running on EMR using S3 have the

[jira] [Comment Edited] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-21 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371882#comment-16371882 ] chris snow edited comment on FLINK-8543 at 2/21/18 7:4

[jira] [Comment Edited] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-21 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371613#comment-16371613 ] chris snow edited comment on FLINK-8543 at 2/21/18 4:5

[jira] [Comment Edited] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-21 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371613#comment-16371613 ] chris snow edited comment on FLINK-8543 at 2/21/18 4:5

[jira] [Comment Edited] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-21 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371613#comment-16371613 ] chris snow edited comment on FLINK-8543 at 2/21/18 4:4

[jira] [Commented] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-21 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371613#comment-16371613 ] chris snow commented on FLINK-8543: --- My apologies, the commented out close() didn't

[jira] [Commented] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-21 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371485#comment-16371485 ] chris snow commented on FLINK-8543: --- Unfortunately, commenting out the cal

[jira] [Comment Edited] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-21 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371403#comment-16371403 ] chris snow edited comment on FLINK-8543 at 2/21/18 1:3

[jira] [Commented] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-21 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371405#comment-16371405 ] chris snow commented on FLINK-8543: --- [~aljoscha] The failures seem to be happening

[jira] [Commented] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-21 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371403#comment-16371403 ] chris snow commented on FLINK-8543: --- Sorry for the delay. I've added

[jira] [Commented] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-20 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16370123#comment-16370123 ] chris snow commented on FLINK-8543: --- [~aljoscha] - I’m using AvroKeyValueSinkWr

[jira] [Commented] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-14 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364895#comment-16364895 ] chris snow commented on FLINK-8543: --- I didn’t see any errors or suspicious entrie

[jira] [Commented] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-11 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16359852#comment-16359852 ] chris snow commented on FLINK-8543: --- I'm hoping that I can get access to an

[jira] [Created] (FLINK-8618) Add S3 to the list of sinks on the delivery guarantees page

2018-02-09 Thread chris snow (JIRA)
chris snow created FLINK-8618: - Summary: Add S3 to the list of sinks on the delivery guarantees page Key: FLINK-8618 URL: https://issues.apache.org/jira/browse/FLINK-8618 Project: Flink Issue

[jira] [Commented] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-07 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356524#comment-16356524 ] chris snow commented on FLINK-8543: --- One thing to note - I've used the stand

[jira] [Updated] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-01 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chris snow updated FLINK-8543: -- Description: I'm hitting an issue with my BucketingSink from a streaming job.   {code:java} retur

[jira] [Updated] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-01 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chris snow updated FLINK-8543: -- Attachment: Screen Shot 2018-01-30 at 18.34.51.png Description: I'm hitting an issue wi

[jira] [Created] (FLINK-8543) Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen

2018-02-01 Thread chris snow (JIRA)
chris snow created FLINK-8543: - Summary: Output Stream closed at org.apache.hadoop.fs.s3a.S3AOutputStream.checkOpen Key: FLINK-8543 URL: https://issues.apache.org/jira/browse/FLINK-8543 Project: Flink

End-to-end exactly once from kafka source to S3 sink

2018-01-28 Thread chris snow
I’m working with a kafka environment where I’m limited to 100 partitions @ 1GB log.retention.bytes each. I’m looking to implement exactly once processing from this kafka source to a S3 sink. If I have understood correctly, Flink will only commit the kafka offsets when the data has been saved to S

[jira] [Updated] (FLINK-8513) Add documentation for connecting to non-AWS S3 endpoints

2018-01-25 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chris snow updated FLINK-8513: -- Description: It would be useful if the documentation provided information on connecting to non-AWS S3

[jira] [Updated] (FLINK-8513) Add documentation for connecting to non-AWS S3 endpoints

2018-01-25 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/FLINK-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chris snow updated FLINK-8513: -- Description: It would be useful if the documentation provided information on connecting to non-AWS S3

[jira] [Created] (FLINK-8513) Add documentation for connecting to non-AWS S3 endpoints

2018-01-25 Thread chris snow (JIRA)
chris snow created FLINK-8513: - Summary: Add documentation for connecting to non-AWS S3 endpoints Key: FLINK-8513 URL: https://issues.apache.org/jira/browse/FLINK-8513 Project: Flink Issue Type

[jira] [Created] (FLINK-8497) KafkaConsumer throws NPE if topic doesn't exist

2018-01-23 Thread chris snow (JIRA)
chris snow created FLINK-8497: - Summary: KafkaConsumer throws NPE if topic doesn't exist Key: FLINK-8497 URL: https://issues.apache.org/jira/browse/FLINK-8497 Project: Flink Issue Type

[no subject]

2017-12-20 Thread chris snow

Re: Apache Kafka Connectors and Apache Camel Components

2017-10-21 Thread chris snow
d connectors as eco-systems aside from AK project. On the other > hand, keeping all the connectors within AK would introduce all dependencies > of the other data systems into AK repo. > > > Guozhang > > > > On Sat, Oct 21, 2017 at 1:28 AM, chris snow wrote: > > >

Apache Kafka Connectors and Apache Camel Components

2017-10-21 Thread chris snow
I've been working with Kafka Connect for a short while, and I can't help but contrast it with the approach taken by Apache Camel. Camel takes an inclusive approach to components - it has a huge number of components (connectors) that are included as part of the official Camel distribution. This ma

Re: KIP-99 streams global ktable - slowly changing dimension type 2 supported?

2017-10-17 Thread chris snow
able as well. > > > Guozhang > > > On Mon, Oct 16, 2017 at 12:51 PM, chris snow wrote: > > > The streams global ktable wiki page [1] describes a data warehouse style > > operation whereby dimension tables are joined to fact tables. > > > > I’m interested i

KIP-99 streams global ktable - slowly changing dimension type 2 supported?

2017-10-16 Thread chris snow
The streams global ktable wiki page [1] describes a data warehouse style operation whereby dimension tables are joined to fact tables. I’m interested in whether this approach works for type 2 slowly changing dimensions [2]? In type 2 scd the dimension record history is preserved and the fact table

how to save state in a polling processor?

2017-08-22 Thread chris snow
The JDBC component shows the following example for basic change data capture: from("timer://MoveNewCustomersEveryHour?period=360") .setBody(constant("select * from customer where create_time > (sysdate-1/24)")) .to("jdbc:testdb") .to("kafka:mytopic?...) I have an ID column in the

Re: Camel Kafka SASL plain with broker list rather than zookeeper

2017-08-20 Thread Chris Snow
I've raised a JIRA to track this: https://issues.apache.org/jira/browse/CAMEL-11682 -- View this message in context: http://camel.465427.n5.nabble.com/Camel-Kafka-SASL-plain-with-broker-list-rather-than-zookeeper-tp5811191p5811365.html Sent from the Camel - Users mailing list archive at Nabble.

[jira] [Created] (CAMEL-11682) Support configuration parameter sasl.jaas.config

2017-08-20 Thread chris snow (JIRA)
chris snow created CAMEL-11682: -- Summary: Support configuration parameter sasl.jaas.config Key: CAMEL-11682 URL: https://issues.apache.org/jira/browse/CAMEL-11682 Project: Camel Issue Type

Re: Camel Kafka SASL plain with broker list rather than zookeeper

2017-08-20 Thread Chris Snow
I was wondering whether camel should also support consumer.properties and producer.properties files as a configuration mechanism? This would allow developers to easily reuse their existing assets for connecting to kafka. -- View this message in context: http://camel.465427.n5.nabble.com/Camel-K

Re: Camel Kafka SASL plain with broker list rather than zookeeper

2017-08-20 Thread Chris Snow
Thanks Zoran, from the documentation it appears that I need to pass the parameters: brokers=kafka01-prod01.messagehub.services.eu-de.bluemix.net:9093kafka02-prod01.messagehub.services.eu-de.bluemix.net:9093 ... saslMechanism=PLAIN securityProtocol=SASL_SSL sslProtocol=TLSv1.2 sslEnabledProtocols=T

Camel Kafka SASL plain with broker list rather than zookeeper

2017-08-18 Thread Chris Snow
IBM MessageHub is a kafka 0.10.2.1 service that uses SASL plain to connect. The connection parameters require passing a list of bootstrap servers and a user name and password: "kafka_brokers_sasl": [ "kafka04-prod01.messagehub.services.eu-de.bluemix.net:9093", "kafka05-prod01.messagehub

[jira] [Created] (SPARK-21430) Add PMML support to SparkR

2017-07-16 Thread chris snow (JIRA)
chris snow created SPARK-21430: -- Summary: Add PMML support to SparkR Key: SPARK-21430 URL: https://issues.apache.org/jira/browse/SPARK-21430 Project: Spark Issue Type: Sub-task

what are the implications of setting `schemaSampleSize = -1` on cloudant connector?

2017-05-16 Thread chris snow
how does this option work? I'm guessing that this may add some overhead to my cloudant data load into spark because it will need to read in all the data before it creates the dataframe?

Re: Collaborative filtering steps in spark

2017-03-29 Thread chris snow
k)(random.nextGaussian().toFloat) > val nrm = blas.snrm2(rank, factor, 1) > blas.sscal(rank, 1.0f / nrm, factor, 1) > factor > } > (srcBlockId, factors) > } > } > > > factor is ~ N(0, 1) and then scaled by the L2 norm, but it looks to me the > abs value is
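The quoted Scala draws each factor from N(0, 1) and then rescales it by its L2 norm. A rough pure-Python sketch of that same initialisation follows, for illustration only; the function name init_factor, the rank of 10, and the seed are assumptions and are not taken from Spark.

```python
import math
import random


def init_factor(rank, rng):
    """Draw `rank` values from N(0, 1), then scale the vector to unit L2 norm."""
    factor = [rng.gauss(0.0, 1.0) for _ in range(rank)]
    norm = math.sqrt(sum(x * x for x in factor))  # plays the role of blas.snrm2
    return [x / norm for x in factor]             # plays the role of blas.sscal


rng = random.Random(42)  # fixed seed so the sketch is repeatable
factor = init_factor(10, rng)
unit_norm = math.sqrt(sum(x * x for x in factor))
print(unit_norm)
```

Scaling by the norm changes only the length of the vector, so individual entries keep the sign they were drawn with and can be negative.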

Collaborative filtering steps in spark

2017-03-26 Thread chris snow
In the paper “Large-Scale Parallel Collaborative Filtering for the Netflix Prize”, the following steps are described for ALS: Step 1 Initialize matrix M by assigning the average rating for that movie as the first row, and small random numbers for the remaining entries. Step 2 Fix M, Solve U by min

[jira] [Commented] (SPARK-20072) Clarify ALS-WR documentation

2017-03-26 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15942473#comment-15942473 ] chris snow commented on SPARK-20072: Will do. Thanks Sean. > Clarify

[jira] [Commented] (SPARK-20072) Clarify ALS-WR documentation

2017-03-23 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938477#comment-15938477 ] chris snow commented on SPARK-20072: Fair enough. Though this did cause me

[jira] [Created] (SPARK-20072) Clarify ALS-WR documentation

2017-03-23 Thread chris snow (JIRA)
chris snow created SPARK-20072: -- Summary: Clarify ALS-WR documentation Key: SPARK-20072 URL: https://issues.apache.org/jira/browse/SPARK-20072 Project: Spark Issue Type: Improvement

Re: Collaborative Filtering - scaling of the regularization parameter

2017-03-23 Thread chris snow
Thanks Nick. If this will help other users, I'll create a JIRA and send a patch. On 23 March 2017 at 13:49, Nick Pentreath wrote: > Yup, that is true and a reasonable clarification of the doc. > > On Thu, 23 Mar 2017 at 00:03 chris snow wrote: >> >> The docum

Collaborative Filtering - scaling of the regularization parameter

2017-03-23 Thread chris snow
The documentation for collaborative filtering is as follows: === Scaling of the regularization parameter Since v1.1, we scale the regularization parameter lambda in solving each least squares problem by the number of ratings the user generated in updating user factors, or the number of ratings th
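For reference, the scaling described in the quoted documentation corresponds to the weighted-lambda regularisation objective as it is usually written for ALS-WR. The notation below is an assumption chosen for illustration and does not come from the thread: $r_{ij}$ is an observed rating, $u_i$ and $m_j$ are the user and product factor vectors, $I$ is the set of observed (user, product) pairs, and $n_{u_i}$, $n_{m_j}$ are the number of ratings made by user $i$ and received by product $j$.

```latex
f(U, M) = \sum_{(i,j) \in I} \left( r_{ij} - u_i^{\top} m_j \right)^2
        + \lambda \left( \sum_i n_{u_i} \lVert u_i \rVert^2
                       + \sum_j n_{m_j} \lVert m_j \rVert^2 \right)
```

Because each penalty term is multiplied by a rating count, the effective regularisation on a factor grows with the number of ratings that user or product has, which is the behaviour the documentation describes.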

[jira] [Commented] (SPARK-20011) inconsistent terminology in als api docs and tutorial

2017-03-18 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931132#comment-15931132 ] chris snow commented on SPARK-20011: Ok, thanks. Can you please assign t

[jira] [Created] (SPARK-20011) inconsistent terminology in als api docs and tutorial

2017-03-18 Thread chris snow (JIRA)
chris snow created SPARK-20011: -- Summary: inconsistent terminology in als api docs and tutorial Key: SPARK-20011 URL: https://issues.apache.org/jira/browse/SPARK-20011 Project: Spark Issue Type

[jira] [Commented] (SPARK-16110) Can't set Python via spark-submit for YARN cluster mode when PYSPARK_PYTHON & PYSPARK_DRIVER_PYTHON are set

2016-12-22 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15769625#comment-15769625 ] chris snow commented on SPARK-16110: +1 for backporting this to 1.6 and 2.0 >

Re: toree install issue - No module named pyspark

2016-12-21 Thread chris snow
thon/, where we are searching for your pyspark distribution. > > Your PYTHONPATH isn't even showing us adding the $SPARK_HOME/python/, which > is also troubling. > > On Wed, Dec 14, 2016 at 9:41 AM chris snow wrote: > > > I'm trying to setup toree as follows: >

toree install issue - No module named pyspark

2016-12-14 Thread chris snow
I'm trying to setup toree as follows: CLUSTER_NAME=$(curl -s -k -u $BI_USER:$BI_PASS -X GET https://${BI_HOST}:9443/api/v1/clusters | python -c 'import sys, json; print(json.load(sys.stdin)["items"][0]["Clusters"]["cluster_name"]);') echo Cluster Name: $CLUSTER_NAME CLUSTER_HOSTS=$(c

[jira] [Commented] (SPARK-18230) MatrixFactorizationModel.recommendProducts throws NoSuchElement exception when the user does not exist

2016-12-06 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15725420#comment-15725420 ] chris snow commented on SPARK-18230: If you are trying to indicate a non exis

[jira] [Comment Edited] (SPARK-18230) MatrixFactorizationModel.recommendProducts throws NoSuchElement exception when the user does not exist

2016-12-06 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15725420#comment-15725420 ] chris snow edited comment on SPARK-18230 at 12/6/16 12:5

Re: unhelpful exception thrown on predict() when ALS trained model doesn't contain user or product?

2016-12-06 Thread chris snow
Ah cool, thanks for the link! On 6 December 2016 at 12:25, Nick Pentreath wrote: > Indeed, it's being tracked here: https://issues.apache. > org/jira/browse/SPARK-18230 though no Pr has been opened yet. > > > On Tue, 6 Dec 2016 at 13:36 chris snow wrote

unhelpful exception thrown on predict() when ALS trained model doesn't contain user or product?

2016-12-06 Thread chris snow
I'm using the MatrixFactorizationModel.predict() method and encountered the following exception: Name: java.util.NoSuchElementException Message: next on empty iterator StackTrace: scala.collection.Iterator$$anon$2.next(Iterator.scala:39) scala.collection.Iterator$$anon$2.next(Iterator.scala:37) sc

[jira] [Updated] (SPARK-18472) ALS Rating to support string product and user keys

2016-11-16 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chris snow updated SPARK-18472: --- Component/s: MLlib > ALS Rating to support string product and user k

[jira] [Created] (SPARK-18472) ALS Rating to support string product and user keys

2016-11-16 Thread chris snow (JIRA)
chris snow created SPARK-18472: -- Summary: ALS Rating to support string product and user keys Key: SPARK-18472 URL: https://issues.apache.org/jira/browse/SPARK-18472 Project: Spark Issue Type

[jira] [Created] (ZEPPELIN-1382) Websocket reconnect button

2016-08-27 Thread chris snow (JIRA)
chris snow created ZEPPELIN-1382: Summary: Websocket reconnect button Key: ZEPPELIN-1382 URL: https://issues.apache.org/jira/browse/ZEPPELIN-1382 Project: Zeppelin Issue Type: New Feature

Typical log.retention.bytes value

2016-08-26 Thread chris snow
When configuring a new Kafka cluster, is it common practice to leave the log.retention.bytes setting at -1? If not -1, it would be interesting to know what size partitions other users have. Many thanks, Chris

[jira] [Commented] (KNOX-733) Knox shell client is susceptible to man-in-the-middle attack

2016-08-22 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/KNOX-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15430520#comment-15430520 ] chris snow commented on KNOX-733: - This sounds good. A few questions ... - If the CL

[jira] [Commented] (KNOX-733) Knox shell client is susceptible to man-in-the-middle attack

2016-08-18 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/KNOX-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427071#comment-15427071 ] chris snow commented on KNOX-733: - I should have said CLI / library rather than just

[jira] [Commented] (KNOX-734) Shell client to have a hdfs copy operation

2016-08-18 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/KNOX-734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426070#comment-15426070 ] chris snow commented on KNOX-734: - I didn't realise that webhdfs didn't have

[jira] [Commented] (KNOX-733) Knox shell client is susceptible to man-in-the-middle attack

2016-08-18 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/KNOX-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426013#comment-15426013 ] chris snow commented on KNOX-733: - In terms of requirements: 1) I would like the CLI t

[jira] [Created] (KNOX-734) Shell client to have a hdfs copy operation

2016-08-16 Thread chris snow (JIRA)
chris snow created KNOX-734: --- Summary: Shell client to have a hdfs copy operation Key: KNOX-734 URL: https://issues.apache.org/jira/browse/KNOX-734 Project: Apache Knox Issue Type: New Feature

[jira] [Created] (KNOX-733) Knox shell client is susceptible to man-in-the-middle attack

2016-08-16 Thread chris snow (JIRA)
chris snow created KNOX-733: --- Summary: Knox shell client is susceptible to man-in-the-middle attack Key: KNOX-733 URL: https://issues.apache.org/jira/browse/KNOX-733 Project: Apache Knox Issue

[jira] [Created] (ZEPPELIN-1314) dump on the R command line in the debug output

2016-08-09 Thread chris snow (JIRA)
chris snow created ZEPPELIN-1314: Summary: dump on the R command line in the debug output Key: ZEPPELIN-1314 URL: https://issues.apache.org/jira/browse/ZEPPELIN-1314 Project: Zeppelin Issue

[jira] [Created] (ZEPPELIN-1252) Please provide Zeppelin precompiled for Hbase 1.2

2016-07-28 Thread chris snow (JIRA)
chris snow created ZEPPELIN-1252: Summary: Please provide Zeppelin precompiled for Hbase 1.2 Key: ZEPPELIN-1252 URL: https://issues.apache.org/jira/browse/ZEPPELIN-1252 Project: Zeppelin

[jira] [Commented] (KNOX-713) Knox Shell HDFS.get.Request is Package Private

2016-05-21 Thread chris snow (JIRA)
[ https://issues.apache.org/jira/browse/KNOX-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15294994#comment-15294994 ] chris snow commented on KNOX-713: - Awesome! Thanks Larry! When is 0.9.1 expected t
