[GitHub] nifi pull request #2736: NIFI-5223 Allow the usage of expression language fo...

2018-05-23 Thread JohannesDaniel
GitHub user JohannesDaniel opened a pull request:

https://github.com/apache/nifi/pull/2736

NIFI-5223 Allow the usage of expression language for properties of RecordSetWriters

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JohannesDaniel/nifi NIFI-5223

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2736.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2736


commit 62db98f86ed397e95c4f92eb45435c39aa932ec0
Author: JohannesDaniel 
Date:   2018-05-22T19:54:48Z

NIFI-5223 Allow the usage of expression language for properties of RecordSetWriters
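
For context, a minimal sketch of what an Expression-Language-enabled property declaration typically looks like in NiFi (the property name below is only illustrative, not necessarily one touched by this PR):

```
import org.apache.nifi.components.PropertyDescriptor;
import org.apache.nifi.expression.ExpressionLanguageScope;
import org.apache.nifi.processor.util.StandardValidators;

public class ExpressionLanguagePropertyExample {
    // Declaring FLOWFILE_ATTRIBUTES scope lets the value be resolved per FlowFile.
    public static final PropertyDescriptor ROOT_TAG_NAME = new PropertyDescriptor.Builder()
            .name("root_tag_name")
            .displayName("Name of Root Tag")
            .description("Tag name of the root element; may reference FlowFile attributes via Expression Language.")
            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
            .expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES)
            .required(true)
            .build();
}
```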




---


[GitHub] nifi issue #2675: NIFI-5113 Add XMLRecordSetWriter

2018-05-14 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2675
  
@markap14 implemented changes as discussed


---


[GitHub] nifi issue #2700: NIFI-5189 Schema name is not available for RecordSchema

2018-05-14 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2700
  
Discussion related to this ticket: 
https://github.com/apache/nifi/pull/2675#discussion_r186106245


---


[GitHub] nifi issue #2700: NIFI-5189 Schema name is not available for RecordSchema

2018-05-14 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2700
  
@markap14 I wanted to fix this before finishing the XMLRecordSetWriter. I 
especially expect beginners to use the schema access via the text property - 
they should not run into this problem.
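
As far as I understand the symptom, the schema name is looked up from the RecordSchema's identifier and can come back empty when the schema was supplied via the Schema Text property; a minimal sketch of that lookup (not code from the PR):

```
import java.util.Optional;

import org.apache.nifi.serialization.record.RecordSchema;

final class SchemaNameExample {
    // Returns the schema name if the identifier carries one; may be empty when the
    // schema was obtained from the "Schema Text" property.
    static Optional<String> schemaName(final RecordSchema schema) {
        return schema.getIdentifier().getName();
    }
}
```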


---


[GitHub] nifi pull request #2700: NIFI-5189 Schema name is not available for RecordSc...

2018-05-14 Thread JohannesDaniel
GitHub user JohannesDaniel opened a pull request:

https://github.com/apache/nifi/pull/2700

NIFI-5189 Schema name is not available for RecordSchema

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JohannesDaniel/nifi NIFI-5189

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2700.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2700


commit 4ff6be45148e6e6d4fa1121fec72dc7c51fbad22
Author: JohannesDaniel 
Date:   2018-05-14T14:28:46Z

NIFI-5189 Schema name is not available for RecordSchema




---


[GitHub] nifi pull request #2675: NIFI-5113 Add XMLRecordSetWriter

2018-05-11 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2675#discussion_r187592686
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/WriteXMLResult.java
 ---
@@ -0,0 +1,602 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import javanet.staxutils.IndentingXMLStreamWriter;
+import org.apache.nifi.NullSuppression;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.schema.access.SchemaAccessWriter;
+import org.apache.nifi.serialization.AbstractRecordSetWriter;
+import org.apache.nifi.serialization.RecordSetWriter;
+import org.apache.nifi.serialization.WriteResult;
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.RawRecordWriter;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordFieldType;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.type.ArrayDataType;
+import org.apache.nifi.serialization.record.type.ChoiceDataType;
+import org.apache.nifi.serialization.record.type.MapDataType;
+import org.apache.nifi.serialization.record.type.RecordDataType;
+import org.apache.nifi.serialization.record.util.DataTypeUtils;
+
+import javax.xml.stream.XMLOutputFactory;
+import javax.xml.stream.XMLStreamException;
+import javax.xml.stream.XMLStreamWriter;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.text.DateFormat;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.function.Supplier;
+
+
+public class WriteXMLResult extends AbstractRecordSetWriter implements 
RecordSetWriter, RawRecordWriter {
+
+final ComponentLog logger;
+final RecordSchema recordSchema;
+final SchemaAccessWriter schemaAccess;
+final XMLStreamWriter writer;
+final NullSuppression nullSuppression;
+final ArrayWrapping arrayWrapping;
+final String arrayTagName;
+final String recordTagName;
+final String rootTagName;
+
+private final Supplier<DateFormat> LAZY_DATE_FORMAT;
+private final Supplier<DateFormat> LAZY_TIME_FORMAT;
+private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT;
+
+public WriteXMLResult(final ComponentLog logger, final RecordSchema 
recordSchema, final SchemaAccessWriter schemaAccess, final OutputStream out, 
final boolean prettyPrint,
+  final NullSuppression nullSuppression, final 
ArrayWrapping arrayWrapping, final String arrayTagName, final String 
rootTagName, final String recordTagName,
+  final String dateFormat, final String 
timeFormat, final String timestampFormat) throws IOException {
+
+super(out);
+
+this.logger = logger;
+this.recordSchema = recordSchema;
+this.schemaAccess = schemaAccess;
+this.nullSuppression = nullSuppression;
+
+this.arrayWrapping = arrayWrapping;
+this.arrayTagName = arrayTagName;
+
+this.rootTagName = rootTagName;
+this.recordTagName = recordTagName;
+
+final DateFormat df = dateFormat == null ? null : 
DataTypeUtils.getDateFormat(dateFormat);
+final DateFormat tf = timeFormat == null ? null : 
DataTypeUtils.getDateFormat(timeFormat);
+final DateFormat tsf = timestampFormat == null ? null : 
DataTypeUtils.getDateFormat(timestampFormat);
+
+LAZY_DATE_FORMAT = () -> df;
+LAZY_TIME_FORMAT = () -> tf;
+LAZY_TIMESTAMP_FORMAT = () -> tsf;
+
+try {
+XMLOutputFactory factory = XMLOutputFactory.newInstance();
+
   

[GitHub] nifi pull request #2675: NIFI-5113 Add XMLRecordSetWriter

2018-05-11 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2675#discussion_r187560805
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordSetWriter.java
 ---
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.NullSuppression;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.components.ValidationContext;
+import org.apache.nifi.components.ValidationResult;
+import org.apache.nifi.expression.ExpressionLanguageScope;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.DateTimeTextRecordSetWriter;
+import org.apache.nifi.serialization.RecordSetWriter;
+import org.apache.nifi.serialization.RecordSetWriterFactory;
+import org.apache.nifi.serialization.record.RecordSchema;
+
+import java.io.IOException;
+import java.io.OutputStream;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+
+@Tags({"xml", "resultset", "writer", "serialize", "record", "recordset", 
"row"})
+@CapabilityDescription("Writes a RecordSet to XML. The records are wrapped 
by a root tag.")
+public class XMLRecordSetWriter extends DateTimeTextRecordSetWriter 
implements RecordSetWriterFactory {
+
+public static final AllowableValue ALWAYS_SUPPRESS = new 
AllowableValue("always-suppress", "Always Suppress",
+"Fields that are missing (present in the schema but not in the 
record), or that have a value of null, will not be written out");
+public static final AllowableValue NEVER_SUPPRESS = new 
AllowableValue("never-suppress", "Never Suppress",
+"Fields that are missing (present in the schema but not in the 
record), or that have a value of null, will be written out as a null value");
+public static final AllowableValue SUPPRESS_MISSING = new 
AllowableValue("suppress-missing", "Suppress Missing Values",
+"When a field has a value of null, it will be written out. 
However, if a field is defined in the schema and not present in the record, the 
field will not be written out.");
+
+public static final AllowableValue USE_PROPERTY_AS_WRAPPER = new 
AllowableValue("use-property-as-wrapper", "Use Property as Wrapper",
+"The value of the property \"Array Tag Name\" will be used as 
the tag name to wrap elements of an array. The field name of the array field 
will be used for the tag name " +
+"of the elements.");
+public static final AllowableValue USE_PROPERTY_FOR_ELEMENTS = new 
AllowableValue("use-property-for-elements", "Use Property for Elements",
+"The value of the property \"Array Tag Name\" will be used for 
the tag name of the elements of an array. The field name of the array field 
will be used as the tag name " +
+"to wrap elements.");
+public static final AllowableValue NO_WRAPPING = new 
AllowableValue("no-wrapping", "No Wrapping",
+"The elements of an array will not be wrapped");
+
+public static final PropertyDescriptor SUPPRESS_NULLS = new 
PropertyDescriptor.Builder()
+.name("suppres

[GitHub] nifi pull request #2675: NIFI-5113 Add XMLRecordSetWriter

2018-05-11 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2675#discussion_r187549020
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/WriteXMLResult.java
 ---
@@ -0,0 +1,602 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import javanet.staxutils.IndentingXMLStreamWriter;
+import org.apache.nifi.NullSuppression;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.schema.access.SchemaAccessWriter;
+import org.apache.nifi.serialization.AbstractRecordSetWriter;
+import org.apache.nifi.serialization.RecordSetWriter;
+import org.apache.nifi.serialization.WriteResult;
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.RawRecordWriter;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordFieldType;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.type.ArrayDataType;
+import org.apache.nifi.serialization.record.type.ChoiceDataType;
+import org.apache.nifi.serialization.record.type.MapDataType;
+import org.apache.nifi.serialization.record.type.RecordDataType;
+import org.apache.nifi.serialization.record.util.DataTypeUtils;
+
+import javax.xml.stream.XMLOutputFactory;
+import javax.xml.stream.XMLStreamException;
+import javax.xml.stream.XMLStreamWriter;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.text.DateFormat;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.function.Supplier;
+
+
+public class WriteXMLResult extends AbstractRecordSetWriter implements 
RecordSetWriter, RawRecordWriter {
+
+final ComponentLog logger;
+final RecordSchema recordSchema;
+final SchemaAccessWriter schemaAccess;
+final XMLStreamWriter writer;
+final NullSuppression nullSuppression;
+final ArrayWrapping arrayWrapping;
+final String arrayTagName;
+final String recordTagName;
+final String rootTagName;
+
+private final Supplier<DateFormat> LAZY_DATE_FORMAT;
+private final Supplier<DateFormat> LAZY_TIME_FORMAT;
+private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT;
+
+public WriteXMLResult(final ComponentLog logger, final RecordSchema 
recordSchema, final SchemaAccessWriter schemaAccess, final OutputStream out, 
final boolean prettyPrint,
+  final NullSuppression nullSuppression, final 
ArrayWrapping arrayWrapping, final String arrayTagName, final String 
rootTagName, final String recordTagName,
+  final String dateFormat, final String 
timeFormat, final String timestampFormat) throws IOException {
+
+super(out);
+
+this.logger = logger;
+this.recordSchema = recordSchema;
+this.schemaAccess = schemaAccess;
+this.nullSuppression = nullSuppression;
+
+this.arrayWrapping = arrayWrapping;
+this.arrayTagName = arrayTagName;
+
+this.rootTagName = rootTagName;
+this.recordTagName = recordTagName;
+
+final DateFormat df = dateFormat == null ? null : 
DataTypeUtils.getDateFormat(dateFormat);
+final DateFormat tf = timeFormat == null ? null : 
DataTypeUtils.getDateFormat(timeFormat);
+final DateFormat tsf = timestampFormat == null ? null : 
DataTypeUtils.getDateFormat(timestampFormat);
+
+LAZY_DATE_FORMAT = () -> df;
+LAZY_TIME_FORMAT = () -> tf;
+LAZY_TIMESTAMP_FORMAT = () -> tsf;
+
+try {
+XMLOutputFactory factory = XMLOutputFactory.newInstance();
+
   

[GitHub] nifi pull request #2675: NIFI-5113 Add XMLRecordSetWriter

2018-05-04 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2675#discussion_r186106245
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordSetWriter.java
 ---
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.NullSuppression;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.components.ValidationContext;
+import org.apache.nifi.components.ValidationResult;
+import org.apache.nifi.expression.ExpressionLanguageScope;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.DateTimeTextRecordSetWriter;
+import org.apache.nifi.serialization.RecordSetWriter;
+import org.apache.nifi.serialization.RecordSetWriterFactory;
+import org.apache.nifi.serialization.record.RecordSchema;
+
+import java.io.IOException;
+import java.io.OutputStream;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+
+@Tags({"xml", "resultset", "writer", "serialize", "record", "recordset", 
"row"})
+@CapabilityDescription("Writes a RecordSet to XML. The records are wrapped 
by a root tag.")
+public class XMLRecordSetWriter extends DateTimeTextRecordSetWriter 
implements RecordSetWriterFactory {
+
+public static final AllowableValue ALWAYS_SUPPRESS = new 
AllowableValue("always-suppress", "Always Suppress",
+"Fields that are missing (present in the schema but not in the 
record), or that have a value of null, will not be written out");
+public static final AllowableValue NEVER_SUPPRESS = new 
AllowableValue("never-suppress", "Never Suppress",
+"Fields that are missing (present in the schema but not in the 
record), or that have a value of null, will be written out as a null value");
+public static final AllowableValue SUPPRESS_MISSING = new 
AllowableValue("suppress-missing", "Suppress Missing Values",
+"When a field has a value of null, it will be written out. 
However, if a field is defined in the schema and not present in the record, the 
field will not be written out.");
+
+public static final AllowableValue USE_PROPERTY_AS_WRAPPER = new 
AllowableValue("use-property-as-wrapper", "Use Property as Wrapper",
+"The value of the property \"Array Tag Name\" will be used as 
the tag name to wrap elements of an array. The field name of the array field 
will be used for the tag name " +
+"of the elements.");
+public static final AllowableValue USE_PROPERTY_FOR_ELEMENTS = new 
AllowableValue("use-property-for-elements", "Use Property for Elements",
+"The value of the property \"Array Tag Name\" will be used for 
the tag name of the elements of an array. The field name of the array field 
will be used as the tag name " +
+"to wrap elements.");
+public static final AllowableValue NO_WRAPPING = new 
AllowableValue("no-wrapping", "No Wrapping",
+"The elements of an array will not be wrapped");
+
+public static final PropertyDescriptor SUPPRESS_NULLS = new 
PropertyDescriptor.Builder()
+.name("suppres

[GitHub] nifi pull request #2675: NIFI-5113 Add XMLRecordSetWriter

2018-05-04 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2675#discussion_r186106598
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/WriteXMLResult.java
 ---
@@ -0,0 +1,602 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import javanet.staxutils.IndentingXMLStreamWriter;
+import org.apache.nifi.NullSuppression;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.schema.access.SchemaAccessWriter;
+import org.apache.nifi.serialization.AbstractRecordSetWriter;
+import org.apache.nifi.serialization.RecordSetWriter;
+import org.apache.nifi.serialization.WriteResult;
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.RawRecordWriter;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordFieldType;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.type.ArrayDataType;
+import org.apache.nifi.serialization.record.type.ChoiceDataType;
+import org.apache.nifi.serialization.record.type.MapDataType;
+import org.apache.nifi.serialization.record.type.RecordDataType;
+import org.apache.nifi.serialization.record.util.DataTypeUtils;
+
+import javax.xml.stream.XMLOutputFactory;
+import javax.xml.stream.XMLStreamException;
+import javax.xml.stream.XMLStreamWriter;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.text.DateFormat;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.function.Supplier;
+
+
+public class WriteXMLResult extends AbstractRecordSetWriter implements 
RecordSetWriter, RawRecordWriter {
+
+final ComponentLog logger;
+final RecordSchema recordSchema;
+final SchemaAccessWriter schemaAccess;
+final XMLStreamWriter writer;
+final NullSuppression nullSuppression;
+final ArrayWrapping arrayWrapping;
+final String arrayTagName;
+final String recordTagName;
+final String rootTagName;
+
+private final Supplier<DateFormat> LAZY_DATE_FORMAT;
+private final Supplier<DateFormAT> LAZY_TIME_FORMAT;
+private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT;
+
+public WriteXMLResult(final ComponentLog logger, final RecordSchema 
recordSchema, final SchemaAccessWriter schemaAccess, final OutputStream out, 
final boolean prettyPrint,
+  final NullSuppression nullSuppression, final 
ArrayWrapping arrayWrapping, final String arrayTagName, final String 
rootTagName, final String recordTagName,
+  final String dateFormat, final String 
timeFormat, final String timestampFormat) throws IOException {
+
+super(out);
+
+this.logger = logger;
+this.recordSchema = recordSchema;
+this.schemaAccess = schemaAccess;
+this.nullSuppression = nullSuppression;
+
+this.arrayWrapping = arrayWrapping;
+this.arrayTagName = arrayTagName;
+
+this.rootTagName = rootTagName;
+this.recordTagName = recordTagName;
+
+final DateFormat df = dateFormat == null ? null : 
DataTypeUtils.getDateFormat(dateFormat);
+final DateFormat tf = timeFormat == null ? null : 
DataTypeUtils.getDateFormat(timeFormat);
+final DateFormat tsf = timestampFormat == null ? null : 
DataTypeUtils.getDateFormat(timestampFormat);
+
+LAZY_DATE_FORMAT = () -> df;
+LAZY_TIME_FORMAT = () -> tf;
+LAZY_TIMESTAMP_FORMAT = () -> tsf;
+
+try {
+XMLOutputFactory factory = XMLOutputFactory.newInstance();
--- 

[GitHub] nifi issue #2675: NIFI-5113 Add XMLRecordSetWriter

2018-05-04 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2675
  
Hi @markap14 

here we go!

Initially, I planned to enable this writer to write XML attributes as well, 
but that would have required a complex workaround. Furthermore, it would have 
raised several additional questions, e.g. how to treat fields of complex types 
that are flagged as attributes, how to validate the schema, and so on.

@pvillard31 How about a performance test ;)


---


[GitHub] nifi pull request #2675: NIFI-5113 Add XMLRecordSetWriter

2018-05-04 Thread JohannesDaniel
GitHub user JohannesDaniel opened a pull request:

https://github.com/apache/nifi/pull/2675

NIFI-5113 Add XMLRecordSetWriter

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JohannesDaniel/nifi NIFI-5113

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2675.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2675


commit 58600a6655a8b1ce11cbc329c9feaa308c240f08
Author: JohannesDaniel 
Date:   2018-04-23T19:35:40Z

Add XMLRecordSetWriter




---


[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-04-23 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2587#discussion_r183489279
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java
 ---
@@ -84,6 +84,10 @@ public XMLRecordReader(InputStream in, RecordSchema 
schema, String rootName, Str
 
 try {
 final XMLInputFactory xmlInputFactory = 
XMLInputFactory.newInstance();
+
+// Avoid namespace replacements
+
xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);
--- End diff --

@tballison Thank you for the advice! I refactored this so that only the local 
part of the element name is considered. 
:)
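
A minimal sketch of the idea (not the exact code from the PR): comparing only the local part of a StAX element name instead of the full qualified name:

```
import javax.xml.namespace.QName;
import javax.xml.stream.events.StartElement;

final class LocalNameExample {
    // The full QName string may include a namespace URI or prefix; the local part does not.
    static boolean hasLocalName(final StartElement element, final String expectedName) {
        final QName name = element.getName();
        return name.getLocalPart().equals(expectedName);
    }
}
```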


---


[GitHub] nifi issue #2587: NIFI-4185 Add XML Record Reader

2018-04-23 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2587
  
@markap14 thanks for the help with the patch file. I am fine with it :)

If you are not planning to do it yourself, I would start implementing an 
XMLWriter as soon as this has been merged. Or should we first discuss your plan 
for the Record attributes you mentioned above?


---


[GitHub] nifi issue #2650: NIFI-5106 Add provenance reporting to GetSolr

2018-04-22 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2650
  
@MikeThomsen Hi, I just saw that GetSolr does not emit RECEIVE provenance 
events when data is retrieved from Solr and written into FlowFiles. I added that.
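
A minimal sketch of what reporting such an event looks like, assuming the FlowFile was just filled with data fetched from Solr and solrLocation is the configured Solr URL (illustrative, not the exact PR code):

```
import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.ProcessSession;

final class ProvenanceReceiveExample {
    // Records a RECEIVE provenance event so lineage shows where the content came from.
    static void reportReceive(final ProcessSession session, final FlowFile flowFile, final String solrLocation) {
        session.getProvenanceReporter().receive(flowFile, solrLocation);
    }
}
```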


---


[GitHub] nifi pull request #2650: NIFI-5106 Add provenance reporting to GetSolr

2018-04-22 Thread JohannesDaniel
GitHub user JohannesDaniel opened a pull request:

https://github.com/apache/nifi/pull/2650

NIFI-5106 Add provenance reporting to GetSolr

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JohannesDaniel/nifi 
NIFI-5106-provenanceGetSolr

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2650.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2650


commit 43f326e2b78cd5a0d40d1647b3a28b640fe8e784
Author: JohannesDaniel 
Date:   2018-04-22T19:06:33Z

Added provenance reporting




---


[GitHub] nifi issue #2587: NIFI-4185 Add XML Record Reader

2018-04-22 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2587
  
@markap14 
- Added EL for record format property
- Removed record tag validation


---


[GitHub] nifi issue #2587: NIFI-4185 Add XML Record Reader

2018-04-20 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2587
  
@markap14 thank you for the response. 

I will simply remove the record tag validation, as there are indeed many 
ways to validate the data before it is processed by this reader. 

There is one little corner case I need to discuss: 
Assuming we have the following data:
```
<record><map_field><key1>value1</key1>...</map_field></record>
```
If the reader is used with (coerce==true), the field "map_field" can be 
parsed by defining a map in the schema. The embedded key fields do not have to 
be defined; their values only have to be of the value type defined for the map. 

If the reader is used with (coerce==false && dropUnknown==true), the reader 
will parse all fields that exist in the schema, ignoring their types. However, the 
data above will not be parsable even if the map exists in the schema. In this 
case, the reader identifies "map_field" as a field that exists in the schema, 
but it is not aware that it is of type map. Therefore, the reader will 
not parse the embedded key fields, as they don't exist in the schema. The field 
"map_field" will be classified as an empty field and not added to the record.

Furthermore, even if the reader is used with (coerce==false && 
dropUnknown==true), it will be type-aware to some extent. The reader first 
checks whether fields exist in the schema. If that is the case, it 
additionally checks whether they are of type record (or of type array 
embedding records, respectively). If that is also the case, the reader 
retrieves the subschema in order to check whether subtags of the 
current tag are known. 
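
To make the map case above concrete, a minimal sketch of how the map field could be declared in a RecordSchema using NiFi's record API (assumptions: field name "map_field", string values; not the project's test code):

```
import java.util.Collections;

import org.apache.nifi.serialization.SimpleRecordSchema;
import org.apache.nifi.serialization.record.RecordField;
import org.apache.nifi.serialization.record.RecordFieldType;
import org.apache.nifi.serialization.record.RecordSchema;

final class MapFieldSchemaExample {
    // Only the value type of the map is declared; the key tags inside <map_field>
    // do not need to be listed in the schema.
    static RecordSchema schemaWithMapField() {
        final RecordField mapField = new RecordField("map_field",
                RecordFieldType.MAP.getMapDataType(RecordFieldType.STRING.getDataType()));
        return new SimpleRecordSchema(Collections.singletonList(mapField));
    }
}
```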



---


[GitHub] nifi issue #2587: NIFI-4185 Add XML Record Reader

2018-04-19 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2587
  
@markap14 @pvillard31 
- I refactored some code, as the cases (coerce==true && drop==false) and 
(coerce==false && drop==true) showed unexpected behavior in some cases
- Data where the same tag appears multiple times, each occurrence holding text 
content, can now be parsed
- Maps (e.g. a map field whose child tags carry the values value1 and value2) 
are now supported
- The reader is now able to parse single records as well as arrays of records. 
I added a property to make it configurable whether the reader shall expect a 
single record or an array. One question: As there are only two options for this, 
I defined AllowableValues for this property. Despite that, I think it would be 
reasonable to enable EL for this property. But how can this be realized? (See 
the sketch after this list.)
- I removed the root validation, but retained the check for record tag names 
in order to support processing data where records are interleaved with other 
tags (a tag like "other" will be ignored if the check for the record tag name 
is activated)
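
Regarding the EL question, a minimal sketch of how an EL-enabled property is typically resolved against FlowFile attributes (hypothetical helper, not a concrete proposal for this reader):

```
import java.util.Map;

import org.apache.nifi.components.PropertyDescriptor;
import org.apache.nifi.context.PropertyContext;

final class ResolvePropertyExample {
    // With Expression Language enabled, the effective value can differ per FlowFile.
    static String resolve(final PropertyContext context, final PropertyDescriptor descriptor,
                          final Map<String, String> flowFileAttributes) {
        return context.getProperty(descriptor)
                .evaluateAttributeExpressions(flowFileAttributes)
                .getValue();
    }
}
```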


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-04-17 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@MikeThomsen 
- added provenance receive notifications
- added tests for attributes
- added an upper limit for the Solr start parameter (1) (see the sketch after 
this list)
- enhanced the documentation for the upper limit of the start parameter and for 
the facet and stats parameters
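
A minimal sketch of bounding a numeric property with a validator (the bound 10000 below is only an illustrative placeholder, not necessarily the limit used in the PR):

```
import org.apache.nifi.components.PropertyDescriptor;
import org.apache.nifi.processor.util.StandardValidators;

final class BoundedStartParameterExample {
    // Rejects values outside [0, 10000] at configuration time.
    public static final PropertyDescriptor SOLR_PARAM_START = new PropertyDescriptor.Builder()
            .name("solr_param_start")
            .displayName("Start")
            .description("Offset into the Solr result set; bounded to discourage deep paging via start/rows.")
            .addValidator(StandardValidators.createLongValidator(0, 10000, true))
            .required(false)
            .build();
}
```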


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-04-17 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
(meaning in 5-6 hours)


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-04-17 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
hi @MikeThomsen I will upload the changes this evening.


---


[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-04-16 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2587#discussion_r181851731
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java
 ---
@@ -0,0 +1,502 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.MapRecord;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.type.ArrayDataType;
+import org.apache.nifi.serialization.record.type.RecordDataType;
+import org.apache.nifi.serialization.record.util.DataTypeUtils;
+
+import javax.xml.stream.XMLEventReader;
+import javax.xml.stream.XMLInputFactory;
+import javax.xml.stream.XMLStreamException;
+import javax.xml.stream.events.Attribute;
+import javax.xml.stream.events.Characters;
+import javax.xml.stream.events.StartElement;
+import javax.xml.stream.events.XMLEvent;
+import java.io.IOException;
+import java.io.InputStream;
+import java.text.DateFormat;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.function.Supplier;
+
+public class XMLRecordReader implements RecordReader {
+
+private final ComponentLog logger;
+private final RecordSchema schema;
+private final String recordName;
+private final String attributePrefix;
+private final String contentFieldName;
+
+// thread safety required?
+private StartElement currentRecordStartTag;
+
+private final XMLEventReader xmlEventReader;
+
+private final Supplier<DateFormat> LAZY_DATE_FORMAT;
+private final Supplier<DateFormat> LAZY_TIME_FORMAT;
+private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT;
+
+public XMLRecordReader(InputStream in, RecordSchema schema, String 
rootName, String recordName, String attributePrefix, String contentFieldName,
+   final String dateFormat, final String 
timeFormat, final String timestampFormat, final ComponentLog logger) throws 
MalformedRecordException {
+this.schema = schema;
+this.recordName = recordName;
+this.attributePrefix = attributePrefix;
+this.contentFieldName = contentFieldName;
+this.logger = logger;
+
+final DateFormat df = dateFormat == null ? null : 
DataTypeUtils.getDateFormat(dateFormat);
+final DateFormat tf = timeFormat == null ? null : 
DataTypeUtils.getDateFormat(timeFormat);
+final DateFormat tsf = timestampFormat == null ? null : 
DataTypeUtils.getDateFormat(timestampFormat);
+
+LAZY_DATE_FORMAT = () -> df;
+LAZY_TIME_FORMAT = () -> tf;
+LAZY_TIMESTAMP_FORMAT = () -> tsf;
+
+try {
+final XMLInputFactory xmlInputFactory = 
XMLInputFactory.newInstance();
+
+// Avoid namespace replacements
+
xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);
+
+xmlEventReader = xmlInputFactory.createXMLEventReader(in);
+final StartElement rootTag = getNextStartTag();
+
+// root tag validation
+if (rootName != null && 
!rootName.equals(rootTag.getName().toString())) {
+final St

[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-04-16 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2587#discussion_r181653767
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java
 ---
@@ -0,0 +1,502 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.MapRecord;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.type.ArrayDataType;
+import org.apache.nifi.serialization.record.type.RecordDataType;
+import org.apache.nifi.serialization.record.util.DataTypeUtils;
+
+import javax.xml.stream.XMLEventReader;
+import javax.xml.stream.XMLInputFactory;
+import javax.xml.stream.XMLStreamException;
+import javax.xml.stream.events.Attribute;
+import javax.xml.stream.events.Characters;
+import javax.xml.stream.events.StartElement;
+import javax.xml.stream.events.XMLEvent;
+import java.io.IOException;
+import java.io.InputStream;
+import java.text.DateFormat;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.function.Supplier;
+
+public class XMLRecordReader implements RecordReader {
+
+private final ComponentLog logger;
+private final RecordSchema schema;
+private final String recordName;
+private final String attributePrefix;
+private final String contentFieldName;
+
+// thread safety required?
+private StartElement currentRecordStartTag;
+
+private final XMLEventReader xmlEventReader;
+
+private final Supplier<DateFormat> LAZY_DATE_FORMAT;
+private final Supplier<DateFormat> LAZY_TIME_FORMAT;
+private final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT;
+
+public XMLRecordReader(InputStream in, RecordSchema schema, String 
rootName, String recordName, String attributePrefix, String contentFieldName,
+   final String dateFormat, final String 
timeFormat, final String timestampFormat, final ComponentLog logger) throws 
MalformedRecordException {
+this.schema = schema;
+this.recordName = recordName;
+this.attributePrefix = attributePrefix;
+this.contentFieldName = contentFieldName;
+this.logger = logger;
+
+final DateFormat df = dateFormat == null ? null : 
DataTypeUtils.getDateFormat(dateFormat);
+final DateFormat tf = timeFormat == null ? null : 
DataTypeUtils.getDateFormat(timeFormat);
+final DateFormat tsf = timestampFormat == null ? null : 
DataTypeUtils.getDateFormat(timestampFormat);
+
+LAZY_DATE_FORMAT = () -> df;
+LAZY_TIME_FORMAT = () -> tf;
+LAZY_TIMESTAMP_FORMAT = () -> tsf;
+
+try {
+final XMLInputFactory xmlInputFactory = 
XMLInputFactory.newInstance();
+
+// Avoid namespace replacements
+
xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);
+
+xmlEventReader = xmlInputFactory.createXMLEventReader(in);
+final StartElement rootTag = getNextStartTag();
+
+// root tag validation
+if (rootName != null && 
!rootName.equals(rootTag.getName().toString())) {
+final St

[GitHub] nifi pull request #2517: NIFI-4516 FetchSolr Processor

2018-04-13 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2517#discussion_r181483327
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/QuerySolr.java
 ---
@@ -0,0 +1,584 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.nifi.processors.solr;
+
+import com.google.gson.stream.JsonWriter;
+import org.apache.nifi.annotation.behavior.InputRequirement;
+import org.apache.nifi.annotation.behavior.WritesAttribute;
+import org.apache.nifi.annotation.behavior.WritesAttributes;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.components.ValidationContext;
+import org.apache.nifi.components.ValidationResult;
+import org.apache.nifi.expression.AttributeExpression;
+import org.apache.nifi.expression.ExpressionLanguageScope;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.flowfile.attributes.CoreAttributes;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.Relationship;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.RecordSetWriter;
+import org.apache.nifi.serialization.RecordSetWriterFactory;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.RecordSet;
+import org.apache.solr.client.solrj.SolrQuery;
+import org.apache.solr.client.solrj.request.QueryRequest;
+import org.apache.solr.client.solrj.response.FacetField;
+import org.apache.solr.client.solrj.response.FieldStatsInfo;
+import org.apache.solr.client.solrj.response.IntervalFacet;
+import org.apache.solr.client.solrj.response.QueryResponse;
+import org.apache.solr.client.solrj.response.RangeFacet;
+import org.apache.solr.client.solrj.response.RangeFacet.Count;
+import org.apache.solr.common.params.CommonParams;
+import org.apache.solr.common.params.FacetParams;
+import org.apache.solr.common.params.MultiMapSolrParams;
+import org.apache.solr.common.params.StatsParams;
+
+import java.io.IOException;
+import java.io.OutputStreamWriter;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import static org.apache.nifi.processors.solr.SolrUtils.SOLR_TYPE;
+import static org.apache.nifi.processors.solr.SolrUtils.COLLECTION;
+import static 
org.apache.nifi.processors.solr.SolrUtils.JAAS_CLIENT_APP_NAME;
+import static org.apache.nifi.processors.solr.SolrUtils.SOLR_TYPE_CLOUD;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SSL_CONTEXT_SERVICE;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_SOCKET_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_CONNECTION_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_MAX_CONNECTIONS;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_MAX_CONNECTIONS_PER_HOST;
+import static org.apache.nifi.processors.solr.SolrUtils.ZK_CLIENT_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.ZK_CONNECTION_TIMEOUT;
+import static org.apache.nifi.processors.solr.SolrUtils.SOLR_LOCATION;
+import static org.apache.nifi.processors.solr.SolrUtils.BASIC_USERNAME

[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-04-13 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@MikeThomsen There are two tests that test Solr result paging:
```
testRetrievalOfFullResults()
testRetrievalOfFullResults2()
```
What do you mean exactly? A test that retrieves more results from Solr? 
What exactly would be the purpose of such a test?


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-04-13 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@ottobackwards you made my day!


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-04-13 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@ottobackwards When I try this I always run into the same problem:

[ERROR] Failed to execute goal on project nifi-livy-processors: Could not 
resolve dependencies for project 
org.apache.nifi:nifi-livy-processors:jar:1.7.0-SNAPSHOT: Could not find 
artifact org.apache.nifi:nifi-standard-processors:jar:tests:1.7.0-SNAPSHOT in 
apache.snapshots (https://repository.apache.org/snapshots) -> [Help 1]
[ERROR] Failed to execute goal on project nifi-slack-processors: Could not 
resolve dependencies for project 
org.apache.nifi:nifi-slack-processors:jar:1.7.0-SNAPSHOT: Could not find 
artifact org.apache.nifi:nifi-standard-processors:jar:tests:1.7.0-SNAPSHOT in 
apache.snapshots (https://repository.apache.org/snapshots) -> [Help 1]

I already deleted everything in .m2/repository/org/apache/nifi, then 
reimported and reran the build --> same error message. 
I never had that with prior versions of NiFi.


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-04-13 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@MikeThomsen I'm still blocked because my Maven does not download the 
NiFi SNAPSHOTs. It downloads all dependencies except the SNAPSHOTs. When I 
follow the links (e.g. 
https://repository.apache.org/snapshots/org/apache/nifi/nifi-processor-utils/1.7.0-SNAPSHOT/nifi-processor-utils-1.7.0-SNAPSHOT.pom)
 I get a 404. Shouldn't I be able to see the dependencies at 
https://repository.apache.org/content/groups/snapshots/org/apache/nifi/?


---


[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-04-12 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2587#discussion_r181221789
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java
 ---
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.annotation.lifecycle.OnEnabled;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.controller.ConfigurationContext;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.DateTimeUtils;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.RecordReaderFactory;
+import org.apache.nifi.serialization.SchemaRegistryService;
+import org.apache.nifi.serialization.record.RecordSchema;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+@Tags({"xml", "record", "reader", "parser"})
+@CapabilityDescription("Reads XML content and creates Record objects. 
Records are expected in the second level of " +
+"XML data, embedded in an enclosing root tag.")
+public class XMLReader extends SchemaRegistryService implements 
RecordReaderFactory {
+
+public static final PropertyDescriptor VALIDATE_ROOT_TAG = new 
PropertyDescriptor.Builder()
+.name("validate_root_tag")
+.displayName("Validate Root Tag")
+.description("If this property is set, the name of root tags 
(e. g. ...) of incoming FlowFiles will be 
evaluated against this value. " +
+"In the case of a mismatch, an exception is thrown. 
The treatment of such FlowFiles depends on the implementation " +
+"of respective Processors.")
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.expressionLanguageSupported(true)
+.required(false)
+.build();
+
+public static final PropertyDescriptor VALIDATE_RECORD_TAG = new 
PropertyDescriptor.Builder()
--- End diff --

(non-record tags shall be skipped)


---


[GitHub] nifi issue #2587: NIFI-4185 Add XML Record Reader

2018-04-12 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2587
  
@markap14 thank you for the comprehensive review. I will start refactoring 
the implementations, beginning with the improvements that are already clear.


---


[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-04-12 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2587#discussion_r181218609
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java
 ---
@@ -0,0 +1,502 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.MapRecord;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.type.ArrayDataType;
+import org.apache.nifi.serialization.record.type.RecordDataType;
+import org.apache.nifi.serialization.record.util.DataTypeUtils;
+
+import javax.xml.stream.XMLEventReader;
+import javax.xml.stream.XMLInputFactory;
+import javax.xml.stream.XMLStreamException;
+import javax.xml.stream.events.Attribute;
+import javax.xml.stream.events.Characters;
+import javax.xml.stream.events.StartElement;
+import javax.xml.stream.events.XMLEvent;
+import java.io.IOException;
+import java.io.InputStream;
+import java.text.DateFormat;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.function.Supplier;
+
+public class XMLRecordReader implements RecordReader {
+
+private final ComponentLog logger;
+private final RecordSchema schema;
+private final String recordName;
+private final String attributePrefix;
+private final String contentFieldName;
+
+// thread safety required?
+private StartElement currentRecordStartTag;
+
+private final XMLEventReader xmlEventReader;
+
+private final Supplier LAZY_DATE_FORMAT;
+private final Supplier LAZY_TIME_FORMAT;
+private final Supplier LAZY_TIMESTAMP_FORMAT;
+
+public XMLRecordReader(InputStream in, RecordSchema schema, String 
rootName, String recordName, String attributePrefix, String contentFieldName,
+   final String dateFormat, final String 
timeFormat, final String timestampFormat, final ComponentLog logger) throws 
MalformedRecordException {
+this.schema = schema;
+this.recordName = recordName;
+this.attributePrefix = attributePrefix;
+this.contentFieldName = contentFieldName;
+this.logger = logger;
+
+final DateFormat df = dateFormat == null ? null : 
DataTypeUtils.getDateFormat(dateFormat);
+final DateFormat tf = timeFormat == null ? null : 
DataTypeUtils.getDateFormat(timeFormat);
+final DateFormat tsf = timestampFormat == null ? null : 
DataTypeUtils.getDateFormat(timestampFormat);
+
+LAZY_DATE_FORMAT = () -> df;
+LAZY_TIME_FORMAT = () -> tf;
+LAZY_TIMESTAMP_FORMAT = () -> tsf;
+
+try {
+final XMLInputFactory xmlInputFactory = 
XMLInputFactory.newInstance();
+
+// Avoid namespace replacements
+
xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);
+
+xmlEventReader = xmlInputFactory.createXMLEventReader(in);
+final StartElement rootTag = getNextStartTag();
+
+// root tag validation
+if (rootName != null && 
!rootName.equals(rootTag.getName().toString())) {
+final St

[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-04-12 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2587#discussion_r181218182
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java
 ---
@@ -0,0 +1,502 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.MapRecord;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.type.ArrayDataType;
+import org.apache.nifi.serialization.record.type.RecordDataType;
+import org.apache.nifi.serialization.record.util.DataTypeUtils;
+
+import javax.xml.stream.XMLEventReader;
+import javax.xml.stream.XMLInputFactory;
+import javax.xml.stream.XMLStreamException;
+import javax.xml.stream.events.Attribute;
+import javax.xml.stream.events.Characters;
+import javax.xml.stream.events.StartElement;
+import javax.xml.stream.events.XMLEvent;
+import java.io.IOException;
+import java.io.InputStream;
+import java.text.DateFormat;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.function.Supplier;
+
+public class XMLRecordReader implements RecordReader {
+
+private final ComponentLog logger;
+private final RecordSchema schema;
+private final String recordName;
+private final String attributePrefix;
+private final String contentFieldName;
+
+// thread safety required?
+private StartElement currentRecordStartTag;
+
+private final XMLEventReader xmlEventReader;
+
+private final Supplier LAZY_DATE_FORMAT;
+private final Supplier LAZY_TIME_FORMAT;
+private final Supplier LAZY_TIMESTAMP_FORMAT;
+
+public XMLRecordReader(InputStream in, RecordSchema schema, String 
rootName, String recordName, String attributePrefix, String contentFieldName,
+   final String dateFormat, final String 
timeFormat, final String timestampFormat, final ComponentLog logger) throws 
MalformedRecordException {
+this.schema = schema;
+this.recordName = recordName;
+this.attributePrefix = attributePrefix;
+this.contentFieldName = contentFieldName;
+this.logger = logger;
+
+final DateFormat df = dateFormat == null ? null : 
DataTypeUtils.getDateFormat(dateFormat);
+final DateFormat tf = timeFormat == null ? null : 
DataTypeUtils.getDateFormat(timeFormat);
+final DateFormat tsf = timestampFormat == null ? null : 
DataTypeUtils.getDateFormat(timestampFormat);
+
+LAZY_DATE_FORMAT = () -> df;
+LAZY_TIME_FORMAT = () -> tf;
+LAZY_TIMESTAMP_FORMAT = () -> tsf;
+
+try {
+final XMLInputFactory xmlInputFactory = 
XMLInputFactory.newInstance();
+
+// Avoid namespace replacements
+
xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);
+
+xmlEventReader = xmlInputFactory.createXMLEventReader(in);
+final StartElement rootTag = getNextStartTag();
+
+// root tag validation
+if (rootName != null && 
!rootName.equals(rootTag.getName().toString())) {
+final St

[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-04-12 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2587#discussion_r181217494
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java
 ---
@@ -0,0 +1,502 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.MapRecord;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.type.ArrayDataType;
+import org.apache.nifi.serialization.record.type.RecordDataType;
+import org.apache.nifi.serialization.record.util.DataTypeUtils;
+
+import javax.xml.stream.XMLEventReader;
+import javax.xml.stream.XMLInputFactory;
+import javax.xml.stream.XMLStreamException;
+import javax.xml.stream.events.Attribute;
+import javax.xml.stream.events.Characters;
+import javax.xml.stream.events.StartElement;
+import javax.xml.stream.events.XMLEvent;
+import java.io.IOException;
+import java.io.InputStream;
+import java.text.DateFormat;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.function.Supplier;
+
+public class XMLRecordReader implements RecordReader {
+
+private final ComponentLog logger;
+private final RecordSchema schema;
+private final String recordName;
+private final String attributePrefix;
+private final String contentFieldName;
+
+// thread safety required?
+private StartElement currentRecordStartTag;
+
+private final XMLEventReader xmlEventReader;
+
+private final Supplier LAZY_DATE_FORMAT;
+private final Supplier LAZY_TIME_FORMAT;
+private final Supplier LAZY_TIMESTAMP_FORMAT;
+
+public XMLRecordReader(InputStream in, RecordSchema schema, String 
rootName, String recordName, String attributePrefix, String contentFieldName,
+   final String dateFormat, final String 
timeFormat, final String timestampFormat, final ComponentLog logger) throws 
MalformedRecordException {
+this.schema = schema;
+this.recordName = recordName;
+this.attributePrefix = attributePrefix;
+this.contentFieldName = contentFieldName;
+this.logger = logger;
+
+final DateFormat df = dateFormat == null ? null : 
DataTypeUtils.getDateFormat(dateFormat);
+final DateFormat tf = timeFormat == null ? null : 
DataTypeUtils.getDateFormat(timeFormat);
+final DateFormat tsf = timestampFormat == null ? null : 
DataTypeUtils.getDateFormat(timestampFormat);
+
+LAZY_DATE_FORMAT = () -> df;
+LAZY_TIME_FORMAT = () -> tf;
+LAZY_TIMESTAMP_FORMAT = () -> tsf;
+
+try {
+final XMLInputFactory xmlInputFactory = 
XMLInputFactory.newInstance();
+
+// Avoid namespace replacements
+
xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);
+
+xmlEventReader = xmlInputFactory.createXMLEventReader(in);
+final StartElement rootTag = getNextStartTag();
+
+// root tag validation
+if (rootName != null && 
!rootName.equals(rootTag.getName().toString())) {
+final St

[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-04-12 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2587#discussion_r181215801
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLRecordReader.java
 ---
@@ -0,0 +1,502 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.MapRecord;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.type.ArrayDataType;
+import org.apache.nifi.serialization.record.type.RecordDataType;
+import org.apache.nifi.serialization.record.util.DataTypeUtils;
+
+import javax.xml.stream.XMLEventReader;
+import javax.xml.stream.XMLInputFactory;
+import javax.xml.stream.XMLStreamException;
+import javax.xml.stream.events.Attribute;
+import javax.xml.stream.events.Characters;
+import javax.xml.stream.events.StartElement;
+import javax.xml.stream.events.XMLEvent;
+import java.io.IOException;
+import java.io.InputStream;
+import java.text.DateFormat;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.function.Supplier;
+
+public class XMLRecordReader implements RecordReader {
+
+private final ComponentLog logger;
+private final RecordSchema schema;
+private final String recordName;
+private final String attributePrefix;
+private final String contentFieldName;
+
+// thread safety required?
+private StartElement currentRecordStartTag;
+
+private final XMLEventReader xmlEventReader;
+
+private final Supplier LAZY_DATE_FORMAT;
+private final Supplier LAZY_TIME_FORMAT;
+private final Supplier LAZY_TIMESTAMP_FORMAT;
+
+public XMLRecordReader(InputStream in, RecordSchema schema, String 
rootName, String recordName, String attributePrefix, String contentFieldName,
+   final String dateFormat, final String 
timeFormat, final String timestampFormat, final ComponentLog logger) throws 
MalformedRecordException {
+this.schema = schema;
+this.recordName = recordName;
+this.attributePrefix = attributePrefix;
+this.contentFieldName = contentFieldName;
+this.logger = logger;
+
+final DateFormat df = dateFormat == null ? null : 
DataTypeUtils.getDateFormat(dateFormat);
+final DateFormat tf = timeFormat == null ? null : 
DataTypeUtils.getDateFormat(timeFormat);
+final DateFormat tsf = timestampFormat == null ? null : 
DataTypeUtils.getDateFormat(timestampFormat);
+
+LAZY_DATE_FORMAT = () -> df;
+LAZY_TIME_FORMAT = () -> tf;
+LAZY_TIMESTAMP_FORMAT = () -> tsf;
+
+try {
+final XMLInputFactory xmlInputFactory = 
XMLInputFactory.newInstance();
+
+// Avoid namespace replacements
+
xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);
--- End diff --

OK, I will activate namespace awareness and implement some tests for this.
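
For reference, a minimal standalone StAX sketch of what enabling namespace awareness changes (this is not the XMLReader code itself; the sample XML and class name are made up):

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class NamespaceAwarenessSketch {
    public static void main(String[] args) throws XMLStreamException {
        String xml = "<ns:record xmlns:ns=\"http://example.org/ns\"><ns:field>content</ns:field></ns:record>";

        XMLInputFactory factory = XMLInputFactory.newInstance();
        // With namespace awareness enabled, prefixes are resolved and local names / namespace URIs
        // are reported separately instead of being part of the raw tag name.
        factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, true);

        XMLEventReader reader = factory.createXMLEventReader(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                StartElement start = event.asStartElement();
                // e.g. "record -> http://example.org/ns" and "field -> http://example.org/ns"
                System.out.println(start.getName().getLocalPart() + " -> " + start.getName().getNamespaceURI());
            }
        }
    }
}
```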


---


[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-04-12 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2587#discussion_r181215337
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java
 ---
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.annotation.lifecycle.OnEnabled;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.controller.ConfigurationContext;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.DateTimeUtils;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.RecordReaderFactory;
+import org.apache.nifi.serialization.SchemaRegistryService;
+import org.apache.nifi.serialization.record.RecordSchema;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+@Tags({"xml", "record", "reader", "parser"})
+@CapabilityDescription("Reads XML content and creates Record objects. 
Records are expected in the second level of " +
--- End diff --

When I started implementing this reader, I was wondering how the reader 
knows whether to parse wrapped records or a single record. Unfortunately, we 
don't have an unambiguous indicator like we have for JSON ([ vs. {). 
I considered making it configurable via EL whether the reader shall 
expect a single record or an array of records (a rough sketch of such a 
property follows below). What do you think?
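
A minimal sketch of what such an EL-enabled property might look like (the property name, display name, and default value are only illustrative, not part of the actual implementation):

```java
import org.apache.nifi.components.PropertyDescriptor;
import org.apache.nifi.processor.util.StandardValidators;

public class ExpectRecordsAsArraySketch {

    // Hypothetical property: lets the user state, optionally per FlowFile via Expression Language,
    // whether the root tag wraps multiple records or is itself a single record.
    public static final PropertyDescriptor EXPECT_RECORDS_AS_ARRAY = new PropertyDescriptor.Builder()
            .name("expect_records_as_array")
            .displayName("Expect Records as Array")
            .description("If set to 'true', the root tag is treated as a wrapper around multiple records. " +
                    "If set to 'false', the root tag itself is treated as a single record.")
            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
            .expressionLanguageSupported(true)
            .defaultValue("false")
            .required(false)
            .build();
}
```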


---


[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-04-12 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2587#discussion_r181213473
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java
 ---
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.annotation.lifecycle.OnEnabled;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.controller.ConfigurationContext;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.DateTimeUtils;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.RecordReaderFactory;
+import org.apache.nifi.serialization.SchemaRegistryService;
+import org.apache.nifi.serialization.record.RecordSchema;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+@Tags({"xml", "record", "reader", "parser"})
+@CapabilityDescription("Reads XML content and creates Record objects. 
Records are expected in the second level of " +
+"XML data, embedded in an enclosing root tag.")
+public class XMLReader extends SchemaRegistryService implements 
RecordReaderFactory {
+
+public static final PropertyDescriptor VALIDATE_ROOT_TAG = new 
PropertyDescriptor.Builder()
+.name("validate_root_tag")
+.displayName("Validate Root Tag")
+.description("If this property is set, the name of root tags 
(e. g. ...) of incoming FlowFiles will be 
evaluated against this value. " +
+"In the case of a mismatch, an exception is thrown. 
The treatment of such FlowFiles depends on the implementation " +
+"of respective Processors.")
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.expressionLanguageSupported(true)
+.required(false)
+.build();
+
+public static final PropertyDescriptor VALIDATE_RECORD_TAG = new 
PropertyDescriptor.Builder()
+.name("validate_record_tag")
+.displayName("Validate Record Tag")
+.description("If this property is set, the name of record tags 
(e. g. ...) of incoming FlowFiles will be 
evaluated against this value. " +
+"In the case of a mismatch, the respective record will 
be skipped. If this property is not set, each level 2 starting tag will be 
treated " +
+"as the beginning of a record.")
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.expressionLanguageSupported(true)
+.required(false)
+.build();
+
+public static final PropertyDescriptor ATTRIBUTE_PREFIX = new 
PropertyDescriptor.Builder()
+.name("attribute_prefix")
+.displayName("Attribute Prefix")
+.description("If this property is set, the name of attributes 
will be appended by a prefix when they are added to a record.")
--- End diff --

ok


---


[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-04-12 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2587#discussion_r181213403
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java
 ---
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.annotation.lifecycle.OnEnabled;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.controller.ConfigurationContext;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.DateTimeUtils;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.RecordReaderFactory;
+import org.apache.nifi.serialization.SchemaRegistryService;
+import org.apache.nifi.serialization.record.RecordSchema;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+@Tags({"xml", "record", "reader", "parser"})
+@CapabilityDescription("Reads XML content and creates Record objects. 
Records are expected in the second level of " +
+"XML data, embedded in an enclosing root tag.")
+public class XMLReader extends SchemaRegistryService implements 
RecordReaderFactory {
+
+public static final PropertyDescriptor VALIDATE_ROOT_TAG = new 
PropertyDescriptor.Builder()
+.name("validate_root_tag")
+.displayName("Validate Root Tag")
+.description("If this property is set, the name of root tags 
(e. g. ...) of incoming FlowFiles will be 
evaluated against this value. " +
+"In the case of a mismatch, an exception is thrown. 
The treatment of such FlowFiles depends on the implementation " +
+"of respective Processors.")
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.expressionLanguageSupported(true)
+.required(false)
+.build();
+
+public static final PropertyDescriptor VALIDATE_RECORD_TAG = new 
PropertyDescriptor.Builder()
--- End diff --

My original intention actually was to enable users to parse record sets like 
this:
```

  ...
  ...
  ...
```

---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-04-12 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@ottobackwards Deep paging should be done by scrolling with cursor marks 
(as it is done in GetSolr). Simple paging can be done by successively increasing 
the offset (the start parameter). However, using cursor marks requires sorting 
on the id field. More information can be found here: 
https://lucene.apache.org/solr/guide/6_6/pagination-of-results.html
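
A minimal SolrJ sketch of the cursor-mark loop described in the linked pagination guide; the Solr URL, collection name, and query are placeholders:

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.params.CursorMarkParams;

import java.io.IOException;

public class CursorMarkPagingSketch {
    public static void main(String[] args) throws SolrServerException, IOException {
        // Placeholder URL and collection, for illustration only
        SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/my_collection").build();

        SolrQuery query = new SolrQuery("*:*");
        query.setRows(100);
        // Cursor marks require a sort that includes the uniqueKey field (here: id)
        query.setSort(SolrQuery.SortClause.asc("id"));

        String cursorMark = CursorMarkParams.CURSOR_MARK_START;
        boolean done = false;
        while (!done) {
            query.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
            QueryResponse response = client.query(query);
            String nextCursorMark = response.getNextCursorMark();

            for (SolrDocument doc : response.getResults()) {
                System.out.println(doc.getFieldValue("id"));
            }

            // When the cursor mark stops changing, the result set has been fully consumed
            done = cursorMark.equals(nextCursorMark);
            cursorMark = nextCursorMark;
        }
        client.close();
    }
}
```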


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-04-12 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@MikeThomsen Thank you for your help with Git!!
@bbende 
- Solr core is a test dependency again (I actually had a reason to add it as a 
non-test dependency, but it was not a good reason...)
- The processor is now called QuerySolr.
- I added the common Solr parameters as properties (except fq). For parameters 
that can be set multiple times, I adopted the logic of PutSolrContentStream: 
these parameters can be added as dynamic properties, e.g. fq.1=field:value and 
fq.2=field:value (a sketch of this naming scheme follows below). I moved this 
method into SolrUtils and adjusted the pattern a little bit (parameters 
containing dots are also allowed now, e.g. facet.field). However, we could 
consider allowing even more characters. Currently, only word characters and 
dots are allowed. The Solr committers highly recommend naming fields with only 
these characters, but it is not mandatory. A pattern like ".*\.\d+$" would 
probably be more suitable.
- I made it configurable whether the processor returns only the top results 
(one request, a single FlowFile) or full result sets (multiple requests, 
multiple FlowFiles with rows as the batch size). In the latter case, the 
processor pages through the results. Additionally, the cursor mark of each 
response is added as an attribute, so users should in principle be able to 
build their own looping via dataflows.
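
A small illustration of how such suffixed dynamic property names could be collapsed into repeated Solr parameters (this is not the actual SolrUtils code; the class and method names are made up):

```java
import org.apache.solr.common.params.ModifiableSolrParams;

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class DynamicSolrParamSketch {

    // Matches names like "fq.1" or "facet.field.2": word characters and dots, ending in ".<digits>"
    private static final Pattern SUFFIXED_PARAMETER = Pattern.compile("^[\\w.]+\\.\\d+$");

    static ModifiableSolrParams toSolrParams(final Map<String, String> dynamicProperties) {
        final ModifiableSolrParams params = new ModifiableSolrParams();
        for (final Map.Entry<String, String> entry : dynamicProperties.entrySet()) {
            String name = entry.getKey();
            if (SUFFIXED_PARAMETER.matcher(name).matches()) {
                // Strip the numeric suffix so that fq.1 and fq.2 both become repeated "fq" parameters
                name = name.substring(0, name.lastIndexOf('.'));
            }
            params.add(name, entry.getValue());
        }
        return params;
    }

    public static void main(String[] args) {
        final Map<String, String> properties = new LinkedHashMap<>();
        properties.put("fq.1", "field:value1");
        properties.put("fq.2", "field:value2");
        properties.put("facet.field.1", "category");
        // Prints something like: fq=field:value1&fq=field:value2&facet.field=category
        System.out.println(toSolrParams(properties));
    }
}
```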


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-04-12 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@MikeThomsen done :)


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-04-12 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@MikeThomsen Or is it normal that the commits of others are shown here after a 
rebase?


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-04-12 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@MikeThomsen OMG, now there are commits of others in the branch?? Maybe I 
should simply close this PR and move the code to a new branch.


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-04-11 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@MikeThomsen I rebased everything; the build of the Solr processors works 
without any problems. However, when I try to build the whole application, I 
receive the following error:
[ERROR] Failed to execute goal on project nifi-livy-processors: Could not 
resolve dependencies for project 
org.apache.nifi:nifi-livy-processors:jar:1.7.0-SNAPSHOT: Failure to find 
org.apache.nifi:nifi-standard-processors:jar:tests:1.7.0-SNAPSHOT in 
https://repository.apache.org/snapshots was cached in the local repository, 
resolution will not be reattempted until the update interval of 
apache.snapshots has elapsed or updates are forced -> [Help 1]
Is this a known problem?


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-04-11 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@MikeThomsen Yeah, this time I looked at it, but I haven't had time to fix it 
yet. I think I have to merge master into the branch because of upstream 
updates. The local build worked. But thanks for the ping :)


---


[GitHub] nifi issue #2587: NIFI-4185 Add XML Record Reader

2018-04-01 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2587
  
Can you maybe post the XML that led to the empty record?


---


[GitHub] nifi issue #2587: NIFI-4185 Add XML Record Reader

2018-04-01 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2587
  
Hi @pvillard31 

thank you for your comments! I implemented all of your suggestions. I like your 
news regarding the performance :-) Which kind of transformation did you test? 
XML => Record or XML => JSON (e.g. with ConvertRecord)?

For some reason a few tests disappeared in a certain commit in my local Git (I 
probably wanted to reorder the tests, but deleted them instead, omg ...). 
However, I added them again (this is why there are many more tests now). 

In addition, I adjusted the definition of how namespaces shall be treated. 

I implemented several tests for XMLReader to verify that the usage of 
expression language works as expected.

However, I was not able to reproduce your observation regarding the empty 
record for the header 
```

```

I implemented the following tests:
```
testSimpleRecordWithHeader()
testSimpleRecordWithHeaderNoValidation()
```

Actually, they work as expected. 


---


[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-04-01 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2587#discussion_r178470648
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/resources/docs/org.apache.nifi.xml.XMLReader/additionalDetails.html
 ---
@@ -0,0 +1,378 @@
+
+
+
+
+
+XMLReader
+
+
+
+
+
+The XMLReader Controller Service reads XML content and creates 
Record objects. The Controller Service
+must be configured with a schema that describes the structure of 
the XML data. Fields in the XML data
+that are not defined in the schema will be skipped.
+
+
+Records are expected in the second level of the XML data, embedded 
within an enclosing root tag:
+
+
+
+<root>
+  <record>
+<field1>content</field1>
+<field2>content</field2>
+  </record>
+  <record>
+<field1>content</field1>
+<field2>content</field2>
+  </record>
+</root>
+
+
+
+
+For the following examples, it is assumed that the exemplary 
records are enclosed by a root tag.
+
+
+Example 1: Simple Fields
+
+
+The simplest kind of data within XML data are tags / fields only 
containing content (no attributes, no embedded tags).
+They can be described in the schema by simple types (e. g. INT, 
STRING, ...).
+
+
+
+
+<record>
+  <simple_field>content</simple_field>
+</record>
+
+
+
+
+This record can be described by a schema containing one field (e. 
g. of type string). By providing this schema,
+the reader expects zero or one occurrences of "simple_field" in 
the record.
+
+
+
+
+{
+  "namespace": "nifi",
+  "name": "test",
+  "type": "record",
+  "fields": [
+{ "name": "simple_field", "type": "string" }
+  ]
+}
+
+
+
+Example 2: Arrays with Simple Fields
+
+
+Arrays are considered as repetitive tags / fields in XML data. For 
the following XML data, "array_field" is considered
+to be an array enclosing simple fields, whereas "simple_field" is 
considered to be a simple field not enclosed in
+an array.
+
+
+
+
+<record>
+  <array_field>content</array_field>
+  <array_field>content</array_field>
+  <simple_field>content</simple_field>
+</record>
+
+
+
+
+This record can be described by the following schema:
+
+
+
+
+{
+  "namespace": "nifi",
+  "name": "test",
+  "type": "record",
+  "fields": [
+{ "name": "array_field", "type":
+  { "type": "array", "items": string }
+},
+{ "name": "simple_field", "type": "string" }
+  ]
+}
+
+
+
+
+If a field in a schema is embedded in an array, the reader expects 
zero, one or more occurrences of the field
+in a record. The field "array_field" principally also could be 
defined as a simple field, but then the second occurrence
+of this field would replace the first in the record object. 
Moreover, the field "simple_field" could also be defined
+as an array. In this case, the reader would put it into the record 
object as an array with one element.
+
+
+Example 3: Tags with Attributes
+
+
+XML fields frequently not only contain

[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-04-01 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2587#discussion_r178470638
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/xml/XMLReader.java
 ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.xml;
+
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.annotation.lifecycle.OnEnabled;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.controller.ConfigurationContext;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.DateTimeUtils;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.RecordReaderFactory;
+import org.apache.nifi.serialization.SchemaRegistryService;
+import org.apache.nifi.serialization.record.RecordSchema;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+@Tags({"xml", "record", "reader", "parser"})
+@CapabilityDescription("Reads XML content and creates Record objects. 
Records are expected in the second level of " +
+"XML data, embedded in an enclosing root tag.")
+public class XMLReader extends SchemaRegistryService implements 
RecordReaderFactory {
+
+public static final PropertyDescriptor VALIDATE_ROOT_TAG = new 
PropertyDescriptor.Builder()
--- End diff --

done. additionally, I added some tests for class XMLReader


---


[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-04-01 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2587#discussion_r178470625
  
--- Diff: 
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/resources/docs/org.apache.nifi.xml.XMLReader/additionalDetails.html
 ---
@@ -0,0 +1,378 @@
+
+
+
+
+
+XMLReader
+
+
+
+
+
+The XMLReader Controller Service reads XML content and creates 
Record objects. The Controller Service
+must be configured with a schema that describes the structure of 
the XML data. Fields in the XML data
+that are not defined in the schema will be skipped.
+
+
+Records are expected in the second level of the XML data, embedded 
within an enclosing root tag:
+
+
+
+<root>
+  <record>
+<field1>content</field1>
+<field2>content</field2>
+  </record>
+  <record>
+<field1>content</field1>
+<field2>content</field2>
+  </record>
+</root>
+
+
+
+
+For the following examples, it is assumed that the exemplary 
records are enclosed by a root tag.
+
+
+Example 1: Simple Fields
+
+
+The simplest kind of data within XML data are tags / fields only 
containing content (no attributes, no embedded tags).
+They can be described in the schema by simple types (e. g. INT, 
STRING, ...).
+
+
+
+
+<record>
+  <simple_field>content</simple_field>
+</record>
+
+
+
+
+This record can be described by a schema containing one field (e. 
g. of type string). By providing this schema,
+the reader expects zero or one occurrences of "simple_field" in 
the record.
+
+
+
+
+{
+  "namespace": "nifi",
+  "name": "test",
+  "type": "record",
+  "fields": [
+{ "name": "simple_field", "type": "string" }
+  ]
+}
+
+
+
+Example 2: Arrays with Simple Fields
+
+
+Arrays are considered as repetitive tags / fields in XML data. For 
the following XML data, "array_field" is considered
+to be an array enclosing simple fields, whereas "simple_field" is 
considered to be a simple field not enclosed in
+an array.
+
+
+
+
+<record>
+  <array_field>content</array_field>
+  <array_field>content</array_field>
+  <simple_field>content</simple_field>
+</record>
+
+
+
+
+This record can be described by the following schema:
+
+
+
+
+{
+  "namespace": "nifi",
+  "name": "test",
+  "type": "record",
+  "fields": [
+{ "name": "array_field", "type":
+  { "type": "array", "items": string }
+},
+{ "name": "simple_field", "type": "string" }
+  ]
+}
+
+
+
+
+If a field in a schema is embedded in an array, the reader expects 
zero, one or more occurrences of the field
+in a record. The field "array_field" principally also could be 
defined as a simple field, but then the second occurrence
+of this field would replace the first in the record object. 
Moreover, the field "simple_field" could also be defined
+as an array. In this case, the reader would put it into the record 
object as an array with one element.
+
+
+Example 3: Tags with Attributes
+
+
+XML fields frequently not only contain

[GitHub] nifi issue #2587: NIFI-4185 Add XML Record Reader

2018-03-27 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2587
  
@pvillard31 here we go :)


---


[GitHub] nifi pull request #2587: NIFI-4185 Add XML Record Reader

2018-03-27 Thread JohannesDaniel
GitHub user JohannesDaniel opened a pull request:

https://github.com/apache/nifi/pull/2587

NIFI-4185 Add XML Record Reader

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI- where  is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JohannesDaniel/nifi NIFI-4185

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2587.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2587


commit 9b4bd0dd8f1d30bfe1597d4cd069df414eb968a0
Author: JohannesDaniel 
Date:   2018-03-06T23:02:43Z

Add XML Record Reader




---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-03-27 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
Hi @MikeThomsen 
The travis-ci build failed due to "The log length has exceeded the limit of 4 MB 
(this usually means that the test suite is raising the same exception over and 
over)." Is there a possibility to change the log level?

I added an IT and added a version to the dependency.


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-03-24 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@MikeThomsen I will keep an eye on the CI tests in the future, thanks for 
the advice. Actually, I did not take them into account as they frequently 
appear to fail for no reason...
I tested the processor within a local NiFi build with Solr (cloud and 
non-cloud) running locally. Everything worked fine. Is that what you meant by 
integration tests? What do you mean by FetchSolrIT? Do you have a link to an 
example?


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-03-24 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@MikeThomsen ok, sorry, I am too impatient :D


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-03-23 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@MikeThomsen any news?


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-03-16 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
Oh, sorry. Done.


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-03-16 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@MikeThomsen Any news?


---


[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-03-10 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
I refactored the handling of FlowFiles routed to the ORIGINAL relationship. 
More attributes are now added to FlowFiles to better describe requests and 
responses. Additionally, I adjusted some tests.


---


[GitHub] nifi pull request #2517: NIFI-4516 FetchSolr Processor

2018-03-09 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2517#discussion_r173614856
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/FetchSolr.java
 ---
@@ -0,0 +1,401 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.nifi.processors.solr;
+
+import com.google.gson.stream.JsonWriter;
+import org.apache.nifi.annotation.behavior.InputRequirement;
+import org.apache.nifi.annotation.behavior.WritesAttribute;
+import org.apache.nifi.annotation.behavior.WritesAttributes;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.expression.AttributeExpression;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.flowfile.attributes.CoreAttributes;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.Relationship;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.io.OutputStreamCallback;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.RecordSetWriter;
+import org.apache.nifi.serialization.RecordSetWriterFactory;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.RecordSet;
+import org.apache.solr.client.solrj.SolrQuery;
+import org.apache.solr.client.solrj.request.QueryRequest;
+import org.apache.solr.client.solrj.response.FacetField;
+import org.apache.solr.client.solrj.response.FieldStatsInfo;
+import org.apache.solr.client.solrj.response.IntervalFacet;
+import org.apache.solr.client.solrj.response.QueryResponse;
+import org.apache.solr.client.solrj.response.RangeFacet;
+import org.apache.solr.client.solrj.response.RangeFacet.Count;
+import org.apache.solr.common.params.FacetParams;
+import org.apache.solr.common.params.MultiMapSolrParams;
+import org.apache.solr.common.params.StatsParams;
+import org.apache.solr.servlet.SolrRequestParsers;
+
+import java.io.IOException;
+import java.io.OutputStream;
+import java.io.OutputStreamWriter;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import static org.apache.nifi.processors.solr.SolrUtils.SOLR_TYPE;
+import static org.apache.nifi.processors.solr.SolrUtils.COLLECTION;
+import static 
org.apache.nifi.processors.solr.SolrUtils.JAAS_CLIENT_APP_NAME;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SSL_CONTEXT_SERVICE;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_SOCKET_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_CONNECTION_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_MAX_CONNECTIONS;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_MAX_CONNECTIONS_PER_HOST;
+import static org.apache.nifi.processors.solr.SolrUtils.ZK_CLIENT_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.ZK_CONNECTION_TIMEOUT;
+import static org.apache.nifi.processors.solr.SolrUtils.SOLR_LOCATION;
+import static org.apache.nifi.processors.solr.SolrUtils.BASIC_USERNAME;
+import static org.apache.nifi.processors.solr.SolrUtils.BASIC_PASSWORD;
+import static org.apache.nifi.processors.solr.SolrUtils.RECORD_WRITER;
+
+@Tags({"Apache", &q

[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-03-09 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
Tested the processor in a local build where it worked as expected.


---


[GitHub] nifi pull request #2517: NIFI-4516 FetchSolr Processor

2018-03-09 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2517#discussion_r173530516
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/FetchSolr.java
 ---
@@ -0,0 +1,401 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.nifi.processors.solr;
+
+import com.google.gson.stream.JsonWriter;
+import org.apache.nifi.annotation.behavior.InputRequirement;
+import org.apache.nifi.annotation.behavior.WritesAttribute;
+import org.apache.nifi.annotation.behavior.WritesAttributes;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.expression.AttributeExpression;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.flowfile.attributes.CoreAttributes;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.Relationship;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.io.OutputStreamCallback;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.RecordSetWriter;
+import org.apache.nifi.serialization.RecordSetWriterFactory;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.RecordSet;
+import org.apache.solr.client.solrj.SolrQuery;
+import org.apache.solr.client.solrj.request.QueryRequest;
+import org.apache.solr.client.solrj.response.FacetField;
+import org.apache.solr.client.solrj.response.FieldStatsInfo;
+import org.apache.solr.client.solrj.response.IntervalFacet;
+import org.apache.solr.client.solrj.response.QueryResponse;
+import org.apache.solr.client.solrj.response.RangeFacet;
+import org.apache.solr.client.solrj.response.RangeFacet.Count;
+import org.apache.solr.common.params.FacetParams;
+import org.apache.solr.common.params.MultiMapSolrParams;
+import org.apache.solr.common.params.StatsParams;
+import org.apache.solr.servlet.SolrRequestParsers;
+
+import java.io.IOException;
+import java.io.OutputStream;
+import java.io.OutputStreamWriter;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import static org.apache.nifi.processors.solr.SolrUtils.SOLR_TYPE;
+import static org.apache.nifi.processors.solr.SolrUtils.COLLECTION;
+import static 
org.apache.nifi.processors.solr.SolrUtils.JAAS_CLIENT_APP_NAME;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SSL_CONTEXT_SERVICE;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_SOCKET_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_CONNECTION_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_MAX_CONNECTIONS;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_MAX_CONNECTIONS_PER_HOST;
+import static org.apache.nifi.processors.solr.SolrUtils.ZK_CLIENT_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.ZK_CONNECTION_TIMEOUT;
+import static org.apache.nifi.processors.solr.SolrUtils.SOLR_LOCATION;
+import static org.apache.nifi.processors.solr.SolrUtils.BASIC_USERNAME;
+import static org.apache.nifi.processors.solr.SolrUtils.BASIC_PASSWORD;
+import static org.apache.nifi.processors.solr.SolrUtils.RECORD_WRITER;
+
+@Tags({"Apache", &q

[GitHub] nifi pull request #2517: NIFI-4516 FetchSolr Processor

2018-03-09 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2517#discussion_r173530358
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/test/java/org/apache/nifi/processors/solr/TestFetchSolr.java
 ---
@@ -0,0 +1,380 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.nifi.processors.solr;
+
+import com.google.gson.stream.JsonReader;
+import org.apache.nifi.json.JsonRecordSetWriter;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.reporting.InitializationException;
+import org.apache.nifi.schema.access.SchemaAccessUtils;
+import org.apache.nifi.util.MockFlowFile;
+import org.apache.nifi.util.TestRunner;
+import org.apache.nifi.util.TestRunners;
+import org.apache.solr.client.solrj.SolrClient;
+import org.apache.solr.common.SolrInputDocument;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.xmlunit.matchers.CompareMatcher;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+import java.util.HashMap;
+import java.util.Locale;
+import java.util.TimeZone;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertThat;
+
+public class TestFetchSolr {
+static final String DEFAULT_SOLR_CORE = "testCollection";
+
+private static final SimpleDateFormat df = new 
SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'", Locale.US);
+static {
+df.setTimeZone(TimeZone.getTimeZone("GMT"));
+}
+
+private SolrClient solrClient;
+
+@Before
+public void setup() {
+
+try {
+
+// create an EmbeddedSolrServer for the processor to use
+String relPath = 
getClass().getProtectionDomain().getCodeSource()
+.getLocation().getFile() + "../../target";
+
+solrClient = 
EmbeddedSolrServerFactory.create(EmbeddedSolrServerFactory.DEFAULT_SOLR_HOME,
+DEFAULT_SOLR_CORE, relPath);
+
+for (int i = 0; i < 10; i++) {
+SolrInputDocument doc = new SolrInputDocument();
+doc.addField("id", "doc" + i);
+Date date = new Date();
+doc.addField("created", df.format(date));
+doc.addField("string_single", "single" + i + ".1");
+doc.addField("string_multi", "multi" + i + ".1");
+doc.addField("string_multi", "multi" + i + ".2");
+doc.addField("integer_single", i);
+doc.addField("integer_multi", 1);
+doc.addField("integer_multi", 2);
+doc.addField("integer_multi", 3);
+doc.addField("double_single", 0.5 + i);
+
+solrClient.add(doc);
+System.out.println(doc.getField("created").getValue());
+
+}
+solrClient.commit();
+} catch (Exception e) {
+e.printStackTrace();
+Assert.fail(e.getMessage());
+}
+}
+
+@After
+public void teardown() {
+try {
+solrClient.close();
+} catch (Exception e) {
+}
+}
+
+@Test
+public void testAllFacetCategories() throws IOException {
+final TestableProcessor proc = new TestableProce

[GitHub] nifi pull request #2517: NIFI-4516 FetchSolr Processor

2018-03-09 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2517#discussion_r173516196
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/FetchSolr.java
 ---
@@ -0,0 +1,401 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.nifi.processors.solr;
+
+import com.google.gson.stream.JsonWriter;
+import org.apache.nifi.annotation.behavior.InputRequirement;
+import org.apache.nifi.annotation.behavior.WritesAttribute;
+import org.apache.nifi.annotation.behavior.WritesAttributes;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.expression.AttributeExpression;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.flowfile.attributes.CoreAttributes;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.Relationship;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.io.OutputStreamCallback;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.RecordSetWriter;
+import org.apache.nifi.serialization.RecordSetWriterFactory;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.RecordSet;
+import org.apache.solr.client.solrj.SolrQuery;
+import org.apache.solr.client.solrj.request.QueryRequest;
+import org.apache.solr.client.solrj.response.FacetField;
+import org.apache.solr.client.solrj.response.FieldStatsInfo;
+import org.apache.solr.client.solrj.response.IntervalFacet;
+import org.apache.solr.client.solrj.response.QueryResponse;
+import org.apache.solr.client.solrj.response.RangeFacet;
+import org.apache.solr.client.solrj.response.RangeFacet.Count;
+import org.apache.solr.common.params.FacetParams;
+import org.apache.solr.common.params.MultiMapSolrParams;
+import org.apache.solr.common.params.StatsParams;
+import org.apache.solr.servlet.SolrRequestParsers;
+
+import java.io.IOException;
+import java.io.OutputStream;
+import java.io.OutputStreamWriter;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import static org.apache.nifi.processors.solr.SolrUtils.SOLR_TYPE;
+import static org.apache.nifi.processors.solr.SolrUtils.COLLECTION;
+import static 
org.apache.nifi.processors.solr.SolrUtils.JAAS_CLIENT_APP_NAME;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SSL_CONTEXT_SERVICE;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_SOCKET_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_CONNECTION_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_MAX_CONNECTIONS;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_MAX_CONNECTIONS_PER_HOST;
+import static org.apache.nifi.processors.solr.SolrUtils.ZK_CLIENT_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.ZK_CONNECTION_TIMEOUT;
+import static org.apache.nifi.processors.solr.SolrUtils.SOLR_LOCATION;
+import static org.apache.nifi.processors.solr.SolrUtils.BASIC_USERNAME;
+import static org.apache.nifi.processors.solr.SolrUtils.BASIC_PASSWORD;
+import static org.apache.nifi.processors.solr.SolrUtils.RECORD_WRITER;
+
+@Tags({"Apache", &q

[GitHub] nifi pull request #2517: NIFI-4516 FetchSolr Processor

2018-03-09 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2517#discussion_r173516051
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/SolrUtils.java
 ---
@@ -66,6 +67,15 @@
 public static final AllowableValue SOLR_TYPE_STANDARD = new 
AllowableValue(
 "Standard", "Standard", "A stand-alone Solr instance.");
 
+public static final PropertyDescriptor RECORD_WRITER = new 
PropertyDescriptor
--- End diff --

This is a really nice idea. However, I would prefer to do this in a 
separate ticket.


---


[GitHub] nifi pull request #2517: NIFI-4516 FetchSolr Processor

2018-03-09 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2517#discussion_r173515717
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/FetchSolr.java
 ---
@@ -0,0 +1,401 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.nifi.processors.solr;
+
+import com.google.gson.stream.JsonWriter;
+import org.apache.nifi.annotation.behavior.InputRequirement;
+import org.apache.nifi.annotation.behavior.WritesAttribute;
+import org.apache.nifi.annotation.behavior.WritesAttributes;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.expression.AttributeExpression;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.flowfile.attributes.CoreAttributes;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.Relationship;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.io.OutputStreamCallback;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.RecordSetWriter;
+import org.apache.nifi.serialization.RecordSetWriterFactory;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.RecordSet;
+import org.apache.solr.client.solrj.SolrQuery;
+import org.apache.solr.client.solrj.request.QueryRequest;
+import org.apache.solr.client.solrj.response.FacetField;
+import org.apache.solr.client.solrj.response.FieldStatsInfo;
+import org.apache.solr.client.solrj.response.IntervalFacet;
+import org.apache.solr.client.solrj.response.QueryResponse;
+import org.apache.solr.client.solrj.response.RangeFacet;
+import org.apache.solr.client.solrj.response.RangeFacet.Count;
+import org.apache.solr.common.params.FacetParams;
+import org.apache.solr.common.params.MultiMapSolrParams;
+import org.apache.solr.common.params.StatsParams;
+import org.apache.solr.servlet.SolrRequestParsers;
+
+import java.io.IOException;
+import java.io.OutputStream;
+import java.io.OutputStreamWriter;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import static org.apache.nifi.processors.solr.SolrUtils.SOLR_TYPE;
+import static org.apache.nifi.processors.solr.SolrUtils.COLLECTION;
+import static 
org.apache.nifi.processors.solr.SolrUtils.JAAS_CLIENT_APP_NAME;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SSL_CONTEXT_SERVICE;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_SOCKET_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_CONNECTION_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_MAX_CONNECTIONS;
+import static 
org.apache.nifi.processors.solr.SolrUtils.SOLR_MAX_CONNECTIONS_PER_HOST;
+import static org.apache.nifi.processors.solr.SolrUtils.ZK_CLIENT_TIMEOUT;
+import static 
org.apache.nifi.processors.solr.SolrUtils.ZK_CONNECTION_TIMEOUT;
+import static org.apache.nifi.processors.solr.SolrUtils.SOLR_LOCATION;
+import static org.apache.nifi.processors.solr.SolrUtils.BASIC_USERNAME;
+import static org.apache.nifi.processors.solr.SolrUtils.BASIC_PASSWORD;
+import static org.apache.nifi.processors.solr.SolrUtils.RECORD_WRITER;
+
+@Tags({"Apache", &q

[GitHub] nifi issue #2517: NIFI-4516 FetchSolr Processor

2018-03-06 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2517
  
@ijokarumawak 
To be aligned with the GetSolr processor, I implemented two options for the data 
format of Solr results: Solr XML and record writers. However, facets and 
stats are written to flowfiles as JSON (with the same structure as the 
Solr JSON response). I did not implement record handling for these two components in 
order to keep the complexity of the processor at a reasonable level. I chose JSON as it 
is probably the best-integrated format in NiFi.
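A minimal sketch of that idea, using the JsonWriter from gson that FetchSolr already 
imports (the names "response" and "out" are assumptions, and the exact layout of the 
real output follows Solr's own facet JSON rather than this fragment):

    // Sketch: stream facet counts into the flowfile content as JSON.
    // "response" is a QueryResponse, "out" the flowfile OutputStream (assumed names);
    // IOException handling is omitted for brevity.
    final JsonWriter writer = new JsonWriter(new OutputStreamWriter(out));
    writer.beginObject();
    writer.name("facet_fields").beginObject();
    for (final FacetField facetField : response.getFacetFields()) {
        writer.name(facetField.getName()).beginObject();
        for (final FacetField.Count count : facetField.getValues()) {
            writer.name(count.getName()).value(count.getCount());
        }
        writer.endObject();
    }
    writer.endObject();
    writer.endObject();
    writer.close();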


---


[GitHub] nifi pull request #2517: NIFI-4516 FetchSolr Processor

2018-03-06 Thread JohannesDaniel
GitHub user JohannesDaniel opened a pull request:

https://github.com/apache/nifi/pull/2517

NIFI-4516 FetchSolr Processor

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI- where  is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JohannesDaniel/nifi NIFI-4516

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2517.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2517


commit 07434cca4907932d55a2d7abe25710ba75f03a4c
Author: JohannesDaniel 
Date:   2018-03-06T22:43:49Z

FetchSolr Processor finalized




---


[GitHub] nifi pull request #2516: NIFI-4516 FetchSolr Processor

2018-03-06 Thread JohannesDaniel
Github user JohannesDaniel closed the pull request at:

https://github.com/apache/nifi/pull/2516


---


[GitHub] nifi issue #2516: Nifi 4516

2018-03-06 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2516
  
@ijokarumawak 
To be aligned with the GetSolr processor, I implemented two options for the data 
format of Solr results: Solr XML and record writers. However, facets and 
stats are written to flowfiles as JSON (with the same structure as the 
Solr JSON response). I did not implement record handling for these two components in 
order to keep the complexity of the processor at a reasonable level. I chose JSON as it 
is probably the best-integrated format in NiFi.


---


[GitHub] nifi pull request #2516: Nifi 4516

2018-03-06 Thread JohannesDaniel
GitHub user JohannesDaniel opened a pull request:

https://github.com/apache/nifi/pull/2516

Nifi 4516

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI- where  is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JohannesDaniel/nifi NIFI-4516

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2516.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2516


commit 0e99a9e52e5537830ca90b318f9684304998b6f4
Author: Pierre Villard 
Date:   2018-03-02T17:08:29Z

NIFI-4922 - Add badges to the README file

Signed-off-by: Pierre Villard 
Signed-off-by: James Wing 

This closes #2505.

commit c585f6e10df4d510207698243147928698948c17
Author: JohannesDaniel 
Date:   2018-03-05T16:30:42Z

NIFI-4516 FetchSolr-Processor




---


[GitHub] nifi issue #2285: NIFI-4583 Restructure nifi-solr-processors

2018-01-03 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2285
  
Ah, I see. Sorry for this. Should we leave it as it is (for this commit), or 
can we fix it?


---


[GitHub] nifi issue #2285: NIFI-4583 Restructure nifi-solr-processors

2018-01-03 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2285
  
@ijokarumawak Not too important, but do you know why this commit is not 
linked to my account? It is linked to "U-WOODMARK\johannes.peter".


---


[GitHub] nifi issue #2285: NIFI-4583 Restructure nifi-solr-processors

2017-12-30 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2285
  
I replaced the values with static imports.


---


[GitHub] nifi issue #2285: NIFI-4583 Restructure nifi-solr-processors

2017-12-30 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2285
  
I made TestGetSolr a bit simpler by adding a method for the generic 
creation of TestRunners. I moved all remaining PropertyDescriptors from 
SolrProcessor to SolrUtils, as they will also be needed for, e.g., 
ControllerServices. 
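A minimal sketch of what such a helper can look like (the method name and the Solr 
location value below are illustrative, not the exact test code):

    // Creates a TestRunner with the shared Solr properties preconfigured,
    // so individual tests only set what they actually exercise.
    private TestRunner createDefaultTestRunner(final GetSolr processor) {
        final TestRunner runner = TestRunners.newTestRunner(processor);
        runner.setProperty(SolrUtils.SOLR_TYPE, SolrUtils.SOLR_TYPE_STANDARD.getValue());
        runner.setProperty(SolrUtils.SOLR_LOCATION, "http://localhost:8983/solr/testCollection");
        return runner;
    }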


---


[GitHub] nifi pull request #2285: NIFI-4583 Restructure nifi-solr-processors

2017-12-30 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2285#discussion_r159123401
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/SolrProcessor.java
 ---
@@ -34,39 +31,33 @@
 import org.apache.nifi.processor.util.StandardValidators;
 import org.apache.nifi.ssl.SSLContextService;
 import org.apache.solr.client.solrj.SolrClient;
-import org.apache.solr.client.solrj.impl.CloudSolrClient;
-import org.apache.solr.client.solrj.impl.HttpClientUtil;
-import org.apache.solr.client.solrj.impl.HttpSolrClient;
 import org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer;
-import org.apache.solr.common.params.ModifiableSolrParams;
 
-import javax.net.ssl.SSLContext;
 import javax.security.auth.login.Configuration;
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collection;
 import java.util.List;
-import java.util.concurrent.TimeUnit;
 
 /**
  * A base class for processors that interact with Apache Solr.
  *
  */
 public abstract class SolrProcessor extends AbstractProcessor {
 
-public static final AllowableValue SOLR_TYPE_CLOUD = new 
AllowableValue(
-"Cloud", "Cloud", "A SolrCloud instance.");
-
-public static final AllowableValue SOLR_TYPE_STANDARD = new 
AllowableValue(
-"Standard", "Standard", "A stand-alone Solr instance.");
-
-public static final PropertyDescriptor SOLR_TYPE = new 
PropertyDescriptor
-.Builder().name("Solr Type")
-.description("The type of Solr instance, Cloud or Standard.")
-.required(true)
-.allowableValues(SOLR_TYPE_CLOUD, SOLR_TYPE_STANDARD)
-.defaultValue(SOLR_TYPE_STANDARD.getValue())
-.build();
+// make PropertyDescriptors of SolrUtils visible for classes extending 
SolrProcessor
+public static AllowableValue SOLR_TYPE_CLOUD = 
SolrUtils.SOLR_TYPE_CLOUD;
+public static AllowableValue SOLR_TYPE_STANDARD = 
SolrUtils.SOLR_TYPE_STANDARD;
+public static PropertyDescriptor SOLR_TYPE = SolrUtils.SOLR_TYPE;
+public static PropertyDescriptor COLLECTION = SolrUtils.COLLECTION;
+public static PropertyDescriptor JAAS_CLIENT_APP_NAME = 
SolrUtils.JAAS_CLIENT_APP_NAME;
+public static PropertyDescriptor SSL_CONTEXT_SERVICE = 
SolrUtils.SSL_CONTEXT_SERVICE;
+public static PropertyDescriptor SOLR_SOCKET_TIMEOUT = 
SolrUtils.SOLR_SOCKET_TIMEOUT;
+public static PropertyDescriptor SOLR_CONNECTION_TIMEOUT = 
SolrUtils.SOLR_CONNECTION_TIMEOUT;
+public static PropertyDescriptor SOLR_MAX_CONNECTIONS = 
SolrUtils.SOLR_MAX_CONNECTIONS;
+public static PropertyDescriptor SOLR_MAX_CONNECTIONS_PER_HOST = 
SolrUtils.SOLR_MAX_CONNECTIONS_PER_HOST;
+public static PropertyDescriptor ZK_CLIENT_TIMEOUT = 
SolrUtils.ZK_CLIENT_TIMEOUT;
+public static PropertyDescriptor ZK_CONNECTION_TIMEOUT = 
SolrUtils.ZK_CONNECTION_TIMEOUT;
--- End diff --

You are right! I would add new PropertyDescriptors to SolrProcessor only 
if they are solely intended to be used by processors. 


---


[GitHub] nifi pull request #2285: NIFI-4583 Restructure nifi-solr-processors

2017-12-30 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2285#discussion_r159122738
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/SolrProcessor.java
 ---
@@ -34,39 +31,33 @@
 import org.apache.nifi.processor.util.StandardValidators;
 import org.apache.nifi.ssl.SSLContextService;
 import org.apache.solr.client.solrj.SolrClient;
-import org.apache.solr.client.solrj.impl.CloudSolrClient;
-import org.apache.solr.client.solrj.impl.HttpClientUtil;
-import org.apache.solr.client.solrj.impl.HttpSolrClient;
 import org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer;
-import org.apache.solr.common.params.ModifiableSolrParams;
 
-import javax.net.ssl.SSLContext;
 import javax.security.auth.login.Configuration;
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collection;
 import java.util.List;
-import java.util.concurrent.TimeUnit;
 
 /**
  * A base class for processors that interact with Apache Solr.
  *
  */
 public abstract class SolrProcessor extends AbstractProcessor {
 
-public static final AllowableValue SOLR_TYPE_CLOUD = new 
AllowableValue(
-"Cloud", "Cloud", "A SolrCloud instance.");
-
-public static final AllowableValue SOLR_TYPE_STANDARD = new 
AllowableValue(
-"Standard", "Standard", "A stand-alone Solr instance.");
-
-public static final PropertyDescriptor SOLR_TYPE = new 
PropertyDescriptor
-.Builder().name("Solr Type")
-.description("The type of Solr instance, Cloud or Standard.")
-.required(true)
-.allowableValues(SOLR_TYPE_CLOUD, SOLR_TYPE_STANDARD)
-.defaultValue(SOLR_TYPE_STANDARD.getValue())
-.build();
+// make PropertyDescriptors of SolrUtils visible for classes extending 
SolrProcessor
+public static AllowableValue SOLR_TYPE_CLOUD = 
SolrUtils.SOLR_TYPE_CLOUD;
+public static AllowableValue SOLR_TYPE_STANDARD = 
SolrUtils.SOLR_TYPE_STANDARD;
+public static PropertyDescriptor SOLR_TYPE = SolrUtils.SOLR_TYPE;
+public static PropertyDescriptor COLLECTION = SolrUtils.COLLECTION;
+public static PropertyDescriptor JAAS_CLIENT_APP_NAME = 
SolrUtils.JAAS_CLIENT_APP_NAME;
+public static PropertyDescriptor SSL_CONTEXT_SERVICE = 
SolrUtils.SSL_CONTEXT_SERVICE;
+public static PropertyDescriptor SOLR_SOCKET_TIMEOUT = 
SolrUtils.SOLR_SOCKET_TIMEOUT;
+public static PropertyDescriptor SOLR_CONNECTION_TIMEOUT = 
SolrUtils.SOLR_CONNECTION_TIMEOUT;
+public static PropertyDescriptor SOLR_MAX_CONNECTIONS = 
SolrUtils.SOLR_MAX_CONNECTIONS;
+public static PropertyDescriptor SOLR_MAX_CONNECTIONS_PER_HOST = 
SolrUtils.SOLR_MAX_CONNECTIONS_PER_HOST;
+public static PropertyDescriptor ZK_CLIENT_TIMEOUT = 
SolrUtils.ZK_CLIENT_TIMEOUT;
+public static PropertyDescriptor ZK_CONNECTION_TIMEOUT = 
SolrUtils.ZK_CONNECTION_TIMEOUT;
--- End diff --

This also affects all JUnit tests for GetSolr and PutSolrContentStream, e.g. 
all lines like
runner.setProperty(GetSolr.SOLR_TYPE, GetSolr.SOLR_TYPE_CLOUD.getValue());
would have to be changed...
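For illustration, the change would look roughly like this (assuming the descriptors 
then only live in SolrUtils and are statically imported in the tests):

    // today, via the constants re-exposed on GetSolr/SolrProcessor:
    runner.setProperty(GetSolr.SOLR_TYPE, GetSolr.SOLR_TYPE_CLOUD.getValue());

    // after removing the re-exposed constants (static import from SolrUtils):
    runner.setProperty(SOLR_TYPE, SOLR_TYPE_CLOUD.getValue());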


---


[GitHub] nifi pull request #2285: NIFI-4583 Restructure nifi-solr-processors

2017-12-30 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2285#discussion_r159122745
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/SolrProcessor.java
 ---
@@ -34,39 +31,33 @@
 import org.apache.nifi.processor.util.StandardValidators;
 import org.apache.nifi.ssl.SSLContextService;
 import org.apache.solr.client.solrj.SolrClient;
-import org.apache.solr.client.solrj.impl.CloudSolrClient;
-import org.apache.solr.client.solrj.impl.HttpClientUtil;
-import org.apache.solr.client.solrj.impl.HttpSolrClient;
 import org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer;
-import org.apache.solr.common.params.ModifiableSolrParams;
 
-import javax.net.ssl.SSLContext;
 import javax.security.auth.login.Configuration;
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collection;
 import java.util.List;
-import java.util.concurrent.TimeUnit;
 
 /**
  * A base class for processors that interact with Apache Solr.
  *
  */
 public abstract class SolrProcessor extends AbstractProcessor {
 
-public static final AllowableValue SOLR_TYPE_CLOUD = new 
AllowableValue(
-"Cloud", "Cloud", "A SolrCloud instance.");
-
-public static final AllowableValue SOLR_TYPE_STANDARD = new 
AllowableValue(
-"Standard", "Standard", "A stand-alone Solr instance.");
-
-public static final PropertyDescriptor SOLR_TYPE = new 
PropertyDescriptor
-.Builder().name("Solr Type")
-.description("The type of Solr instance, Cloud or Standard.")
-.required(true)
-.allowableValues(SOLR_TYPE_CLOUD, SOLR_TYPE_STANDARD)
-.defaultValue(SOLR_TYPE_STANDARD.getValue())
-.build();
+// make PropertyDescriptors of SolrUtils visible for classes extending 
SolrProcessor
+public static AllowableValue SOLR_TYPE_CLOUD = 
SolrUtils.SOLR_TYPE_CLOUD;
+public static AllowableValue SOLR_TYPE_STANDARD = 
SolrUtils.SOLR_TYPE_STANDARD;
+public static PropertyDescriptor SOLR_TYPE = SolrUtils.SOLR_TYPE;
+public static PropertyDescriptor COLLECTION = SolrUtils.COLLECTION;
+public static PropertyDescriptor JAAS_CLIENT_APP_NAME = 
SolrUtils.JAAS_CLIENT_APP_NAME;
+public static PropertyDescriptor SSL_CONTEXT_SERVICE = 
SolrUtils.SSL_CONTEXT_SERVICE;
+public static PropertyDescriptor SOLR_SOCKET_TIMEOUT = 
SolrUtils.SOLR_SOCKET_TIMEOUT;
+public static PropertyDescriptor SOLR_CONNECTION_TIMEOUT = 
SolrUtils.SOLR_CONNECTION_TIMEOUT;
+public static PropertyDescriptor SOLR_MAX_CONNECTIONS = 
SolrUtils.SOLR_MAX_CONNECTIONS;
+public static PropertyDescriptor SOLR_MAX_CONNECTIONS_PER_HOST = 
SolrUtils.SOLR_MAX_CONNECTIONS_PER_HOST;
+public static PropertyDescriptor ZK_CLIENT_TIMEOUT = 
SolrUtils.ZK_CLIENT_TIMEOUT;
+public static PropertyDescriptor ZK_CONNECTION_TIMEOUT = 
SolrUtils.ZK_CONNECTION_TIMEOUT;
--- End diff --

I suggest leaving it as it is...


---


[GitHub] nifi pull request #2285: NIFI-4583 Restructure nifi-solr-processors

2017-12-30 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2285#discussion_r159122676
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/SolrProcessor.java
 ---
@@ -34,39 +31,33 @@
 import org.apache.nifi.processor.util.StandardValidators;
 import org.apache.nifi.ssl.SSLContextService;
 import org.apache.solr.client.solrj.SolrClient;
-import org.apache.solr.client.solrj.impl.CloudSolrClient;
-import org.apache.solr.client.solrj.impl.HttpClientUtil;
-import org.apache.solr.client.solrj.impl.HttpSolrClient;
 import org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer;
-import org.apache.solr.common.params.ModifiableSolrParams;
 
-import javax.net.ssl.SSLContext;
 import javax.security.auth.login.Configuration;
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collection;
 import java.util.List;
-import java.util.concurrent.TimeUnit;
 
 /**
  * A base class for processors that interact with Apache Solr.
  *
  */
 public abstract class SolrProcessor extends AbstractProcessor {
 
-public static final AllowableValue SOLR_TYPE_CLOUD = new 
AllowableValue(
-"Cloud", "Cloud", "A SolrCloud instance.");
-
-public static final AllowableValue SOLR_TYPE_STANDARD = new 
AllowableValue(
-"Standard", "Standard", "A stand-alone Solr instance.");
-
-public static final PropertyDescriptor SOLR_TYPE = new 
PropertyDescriptor
-.Builder().name("Solr Type")
-.description("The type of Solr instance, Cloud or Standard.")
-.required(true)
-.allowableValues(SOLR_TYPE_CLOUD, SOLR_TYPE_STANDARD)
-.defaultValue(SOLR_TYPE_STANDARD.getValue())
-.build();
+// make PropertyDescriptors of SolrUtils visible for classes extending 
SolrProcessor
+public static AllowableValue SOLR_TYPE_CLOUD = 
SolrUtils.SOLR_TYPE_CLOUD;
+public static AllowableValue SOLR_TYPE_STANDARD = 
SolrUtils.SOLR_TYPE_STANDARD;
+public static PropertyDescriptor SOLR_TYPE = SolrUtils.SOLR_TYPE;
+public static PropertyDescriptor COLLECTION = SolrUtils.COLLECTION;
+public static PropertyDescriptor JAAS_CLIENT_APP_NAME = 
SolrUtils.JAAS_CLIENT_APP_NAME;
+public static PropertyDescriptor SSL_CONTEXT_SERVICE = 
SolrUtils.SSL_CONTEXT_SERVICE;
+public static PropertyDescriptor SOLR_SOCKET_TIMEOUT = 
SolrUtils.SOLR_SOCKET_TIMEOUT;
+public static PropertyDescriptor SOLR_CONNECTION_TIMEOUT = 
SolrUtils.SOLR_CONNECTION_TIMEOUT;
+public static PropertyDescriptor SOLR_MAX_CONNECTIONS = 
SolrUtils.SOLR_MAX_CONNECTIONS;
+public static PropertyDescriptor SOLR_MAX_CONNECTIONS_PER_HOST = 
SolrUtils.SOLR_MAX_CONNECTIONS_PER_HOST;
+public static PropertyDescriptor ZK_CLIENT_TIMEOUT = 
SolrUtils.ZK_CLIENT_TIMEOUT;
+public static PropertyDescriptor ZK_CONNECTION_TIMEOUT = 
SolrUtils.ZK_CONNECTION_TIMEOUT;
--- End diff --

Hi, thanks for the response. Changing this to static imports means that several 
PropertyDescriptors would have to be imported in the classes extending SolrProcessor as 
well. My idea was to re-expose the PropertyDescriptors as fields so that they can be 
used in the same way they were used before. However, I can change this.
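For comparison, a minimal sketch of the two variants being discussed (fragments only, 
not complete classes):

    // Variant in this PR: re-expose the shared descriptor as a field on SolrProcessor,
    // so subclasses keep referencing it exactly as before:
    public static PropertyDescriptor SOLR_TYPE = SolrUtils.SOLR_TYPE;

    // Alternative: drop the field and use a static import in each subclass instead:
    // import static org.apache.nifi.processors.solr.SolrUtils.SOLR_TYPE;
    // ...
    // descriptors.add(SOLR_TYPE);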


---


[GitHub] nifi pull request #2285: NIFI-4583 Restructure nifi-solr-processors

2017-12-03 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2285#discussion_r154523361
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/SolrUtils.java
 ---
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.nifi.processors.solr;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.http.client.HttpClient;
+import org.apache.http.conn.scheme.Scheme;
+import org.apache.http.conn.ssl.SSLSocketFactory;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.context.PropertyContext;
+import org.apache.nifi.processor.io.OutputStreamCallback;
+import org.apache.nifi.serialization.record.ListRecordSet;
+import org.apache.nifi.serialization.record.MapRecord;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordFieldType;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.RecordSet;
+import org.apache.nifi.ssl.SSLContextService;
+import org.apache.solr.client.solrj.SolrClient;
+import org.apache.solr.client.solrj.impl.CloudSolrClient;
+import org.apache.solr.client.solrj.impl.HttpClientUtil;
+import org.apache.solr.client.solrj.impl.HttpSolrClient;
+import org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer;
+import org.apache.solr.client.solrj.response.QueryResponse;
+import org.apache.solr.client.solrj.util.ClientUtils;
+import org.apache.solr.common.SolrDocument;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.ModifiableSolrParams;
+
+import javax.net.ssl.SSLContext;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+
+public class SolrUtils {
--- End diff --

@bbende any other comments? As soon as this pull request is merged, I can 
create a pull request for NIFI-4516


---


[GitHub] nifi pull request #2285: NIFI-4583 Restructure nifi-solr-processors

2017-11-28 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2285#discussion_r153454150
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/SolrUtils.java
 ---
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.nifi.processors.solr;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.http.client.HttpClient;
+import org.apache.http.conn.scheme.Scheme;
+import org.apache.http.conn.ssl.SSLSocketFactory;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.context.PropertyContext;
+import org.apache.nifi.processor.io.OutputStreamCallback;
+import org.apache.nifi.serialization.record.ListRecordSet;
+import org.apache.nifi.serialization.record.MapRecord;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordFieldType;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.RecordSet;
+import org.apache.nifi.ssl.SSLContextService;
+import org.apache.solr.client.solrj.SolrClient;
+import org.apache.solr.client.solrj.impl.CloudSolrClient;
+import org.apache.solr.client.solrj.impl.HttpClientUtil;
+import org.apache.solr.client.solrj.impl.HttpSolrClient;
+import org.apache.solr.client.solrj.impl.Krb5HttpClientConfigurer;
+import org.apache.solr.client.solrj.response.QueryResponse;
+import org.apache.solr.client.solrj.util.ClientUtils;
+import org.apache.solr.common.SolrDocument;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.ModifiableSolrParams;
+
+import javax.net.ssl.SSLContext;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+
+public class SolrUtils {
--- End diff --

Koji's idea was to define createSolrClient as non-static to facilitate 
unit testing. I have now defined everything as static.
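For context, the unit-testing benefit Koji had in mind would roughly look like this 
(sketch only; the exact method signature on SolrProcessor is an assumption here):

    // With an overridable instance method, a test can inject the embedded client:
    final GetSolr processor = new GetSolr() {
        @Override
        protected SolrClient createSolrClient(final ProcessContext context, final String solrLocation) {
            return solrClient; // e.g. the EmbeddedSolrServer created in the test setup
        }
    };
    final TestRunner runner = TestRunners.newTestRunner(processor);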


---


[GitHub] nifi issue #2285: NIFI-4583 Restructure nifi-solr-processors

2017-11-21 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2285
  
(tested both processors in a local build)


---


[GitHub] nifi issue #2285: NIFI-4583 Restructure nifi-solr-processors

2017-11-21 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2285
  
Restructured the methods as discussed in JIRA. Additionally, I improved the 
unit tests for GetSolr


---


[GitHub] nifi pull request #2285: NIFI-4583 Restructure nifi-solr-processors

2017-11-21 Thread JohannesDaniel
GitHub user JohannesDaniel opened a pull request:

https://github.com/apache/nifi/pull/2285

NIFI-4583 Restructure nifi-solr-processors

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI- where  is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JohannesDaniel/nifi NIFI-4583

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2285.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2285


commit 9478079d1bbbec5109a01f1d50431d7f06111928
Author: U-WOODMARK\johannes.peter 
Date:   2017-11-21T15:38:18Z

NIFI-4583 Restructure nifi-solr-processors




---


[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-20 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2199#discussion_r145946147
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java
 ---
@@ -66,42 +79,72 @@
 import org.apache.solr.common.SolrDocument;
 import org.apache.solr.common.SolrDocumentList;
 import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.CursorMarkParams;
 
-@Tags({"Apache", "Solr", "Get", "Pull"})
+@Tags({"Apache", "Solr", "Get", "Pull", "Records"})
 @InputRequirement(Requirement.INPUT_FORBIDDEN)
-@CapabilityDescription("Queries Solr and outputs the results as a 
FlowFile")
+@CapabilityDescription("Queries Solr and outputs the results as a FlowFile 
in the format of XML or using a Record Writer")
+@Stateful(scopes = {Scope.CLUSTER}, description = "Stores latest date of 
Date Field so that the same data will not be fetched multiple times.")
 public class GetSolr extends SolrProcessor {
 
-public static final PropertyDescriptor SOLR_QUERY = new 
PropertyDescriptor
-.Builder().name("Solr Query")
-.description("A query to execute against Solr")
+public static final String STATE_MANAGER_FILTER = 
"stateManager_filter";
+public static final String STATE_MANAGER_CURSOR_MARK = 
"stateManager_cursorMark";
+public static final AllowableValue MODE_XML = new 
AllowableValue("XML");
+public static final AllowableValue MODE_REC = new 
AllowableValue("Records");
--- End diff --

Hmm, and dynamic fields could become a problem... I think this is not 
possible.


---


[GitHub] nifi issue #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-19 Thread JohannesDaniel
Github user JohannesDaniel commented on the issue:

https://github.com/apache/nifi/pull/2199
  
I don't think the timezone and commit issues are still relevant. 
GetSolr now takes the timestamp directly from the results. Commit delays won't 
be a problem, as the state is only updated when new documents are retrieved, no 
matter when the query was executed. The same applies to the timezone 
issue.


---


[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-19 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2199#discussion_r145727845
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java
 ---
@@ -170,159 +210,213 @@ protected void init(final 
ProcessorInitializationContext context) {
 return this.descriptors;
 }
 
+final static Set<String> propertyNamesForActivatingClearState = new 
HashSet<String>();
+static {
+propertyNamesForActivatingClearState.add(SOLR_TYPE.getName());
+propertyNamesForActivatingClearState.add(SOLR_LOCATION.getName());
+propertyNamesForActivatingClearState.add(COLLECTION.getName());
+propertyNamesForActivatingClearState.add(SOLR_QUERY.getName());
+propertyNamesForActivatingClearState.add(DATE_FIELD.getName());
+propertyNamesForActivatingClearState.add(RETURN_FIELDS.getName());
+}
+
 @Override
 public void onPropertyModified(PropertyDescriptor descriptor, String 
oldValue, String newValue) {
-lastEndDatedRef.set(UNINITIALIZED_LAST_END_DATE_VALUE);
+if 
(propertyNamesForActivatingClearState.contains(descriptor.getName()))
+clearState.set(true);
 }
 
-@OnStopped
-public void onStopped() {
-writeLastEndDate();
-}
+@OnScheduled
+public void clearState(final ProcessContext context) throws 
IOException {
+if (clearState.getAndSet(false)) {
+context.getStateManager().clear(Scope.CLUSTER);
+final Map<String, String> newStateMap = new 
HashMap<String, String>();
 
-@OnRemoved
-public void onRemoved() {
-final File lastEndDateCache = new File(FILE_PREFIX + 
getIdentifier());
-if (lastEndDateCache.exists()) {
-lastEndDateCache.delete();
-}
-}
+newStateMap.put(STATE_MANAGER_CURSOR_MARK, "*");
 
-@Override
-public void onTrigger(ProcessContext context, ProcessSession session) 
throws ProcessException {
-final ComponentLog logger = getLogger();
-readLastEndDate();
-
-final SimpleDateFormat sdf = new 
SimpleDateFormat(LAST_END_DATE_PATTERN, Locale.US);
-sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
-final String currDate = sdf.format(new Date());
-
-final boolean initialized = 
!UNINITIALIZED_LAST_END_DATE_VALUE.equals(lastEndDatedRef.get());
-
-final String query = context.getProperty(SOLR_QUERY).getValue();
-final SolrQuery solrQuery = new SolrQuery(query);
-solrQuery.setRows(context.getProperty(BATCH_SIZE).asInteger());
-
-// if initialized then apply a filter to restrict results from the 
last end time til now
-if (initialized) {
-StringBuilder filterQuery = new StringBuilder();
-filterQuery.append(context.getProperty(DATE_FIELD).getValue())
-.append(":{").append(lastEndDatedRef.get()).append(" 
TO ")
-.append(currDate).append("]");
-solrQuery.addFilterQuery(filterQuery.toString());
-logger.info("Applying filter query {}", new 
Object[]{filterQuery.toString()});
-}
+final String initialDate = 
context.getProperty(DATE_FILTER).getValue();
+if (StringUtils.isBlank(initialDate))
+newStateMap.put(STATE_MANAGER_FILTER, "*");
+else
+newStateMap.put(STATE_MANAGER_FILTER, initialDate);
 
-final String returnFields = 
context.getProperty(RETURN_FIELDS).getValue();
-if (returnFields != null && !returnFields.trim().isEmpty()) {
-for (String returnField : returnFields.trim().split("[,]")) {
-solrQuery.addField(returnField.trim());
-}
+context.getStateManager().setState(newStateMap, Scope.CLUSTER);
+
+id_field = null;
 }
+}
 
-final String fullSortClause = 
context.getProperty(SORT_CLAUSE).getValue();
-if (fullSortClause != null && !fullSortClause.trim().isEmpty()) {
-for (String sortClause : fullSortClause.split("[,]")) {
-String[] sortParts = sortClause.trim().split("[ ]");
-solrQuery.addSort(sortParts[0], 
SolrQuery.ORDER.valueOf(sortParts[1]));
-}
+@Override
+protected final Collection<ValidationResult> 
additionalCustomValidation(ValidationContext context) {
+final Collection<ValidationResult> problems = new ArrayList<>();
+
+if 
(context.getProperty(RETUR

[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-19 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2199#discussion_r145721674
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java
 ---
@@ -66,42 +79,72 @@
 import org.apache.solr.common.SolrDocument;
 import org.apache.solr.common.SolrDocumentList;
 import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.CursorMarkParams;
 
-@Tags({"Apache", "Solr", "Get", "Pull"})
+@Tags({"Apache", "Solr", "Get", "Pull", "Records"})
 @InputRequirement(Requirement.INPUT_FORBIDDEN)
-@CapabilityDescription("Queries Solr and outputs the results as a 
FlowFile")
+@CapabilityDescription("Queries Solr and outputs the results as a FlowFile 
in the format of XML or using a Record Writer")
+@Stateful(scopes = {Scope.CLUSTER}, description = "Stores latest date of 
Date Field so that the same data will not be fetched multiple times.")
 public class GetSolr extends SolrProcessor {
 
-public static final PropertyDescriptor SOLR_QUERY = new 
PropertyDescriptor
-.Builder().name("Solr Query")
-.description("A query to execute against Solr")
+public static final String STATE_MANAGER_FILTER = 
"stateManager_filter";
+public static final String STATE_MANAGER_CURSOR_MARK = 
"stateManager_cursorMark";
+public static final AllowableValue MODE_XML = new 
AllowableValue("XML");
+public static final AllowableValue MODE_REC = new 
AllowableValue("Records");
+
+public static final PropertyDescriptor RETURN_TYPE = new 
PropertyDescriptor
+.Builder().name("Return Type")
+.displayName("Return Type")
--- End diff --

Most of the properties were already available in the prior GetSolr processor. I 
expected this to be critical for backwards compatibility. For the new 
properties I chose the same naming pattern.
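For illustration, keeping the stored property name stable while the new displayName is 
added on top (the name and description below are taken from the existing SOLR_QUERY 
descriptor; the rest is a sketch, not new code in this PR):

    public static final PropertyDescriptor SOLR_QUERY = new PropertyDescriptor
            .Builder().name("Solr Query")      // unchanged, so existing flows keep resolving the property
            .displayName("Solr Query")         // what the UI shows
            .description("A query to execute against Solr")
            .build();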


---


[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-19 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2199#discussion_r145720938
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java
 ---
@@ -66,42 +79,72 @@
 import org.apache.solr.common.SolrDocument;
 import org.apache.solr.common.SolrDocumentList;
 import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.CursorMarkParams;
 
-@Tags({"Apache", "Solr", "Get", "Pull"})
+@Tags({"Apache", "Solr", "Get", "Pull", "Records"})
 @InputRequirement(Requirement.INPUT_FORBIDDEN)
-@CapabilityDescription("Queries Solr and outputs the results as a 
FlowFile")
+@CapabilityDescription("Queries Solr and outputs the results as a FlowFile 
in the format of XML or using a Record Writer")
+@Stateful(scopes = {Scope.CLUSTER}, description = "Stores latest date of 
Date Field so that the same data will not be fetched multiple times.")
 public class GetSolr extends SolrProcessor {
 
-public static final PropertyDescriptor SOLR_QUERY = new 
PropertyDescriptor
-.Builder().name("Solr Query")
-.description("A query to execute against Solr")
+public static final String STATE_MANAGER_FILTER = 
"stateManager_filter";
+public static final String STATE_MANAGER_CURSOR_MARK = 
"stateManager_cursorMark";
+public static final AllowableValue MODE_XML = new 
AllowableValue("XML");
+public static final AllowableValue MODE_REC = new 
AllowableValue("Records");
--- End diff --

Additionally, this requires parsing the response JSON, as response 
parsing for the Schema API is not really implemented in SolrJ.


---


[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-19 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2199#discussion_r145719121
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java
 ---
@@ -66,42 +79,72 @@
 import org.apache.solr.common.SolrDocument;
 import org.apache.solr.common.SolrDocumentList;
 import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.CursorMarkParams;
 
-@Tags({"Apache", "Solr", "Get", "Pull"})
+@Tags({"Apache", "Solr", "Get", "Pull", "Records"})
 @InputRequirement(Requirement.INPUT_FORBIDDEN)
-@CapabilityDescription("Queries Solr and outputs the results as a 
FlowFile")
+@CapabilityDescription("Queries Solr and outputs the results as a FlowFile 
in the format of XML or using a Record Writer")
+@Stateful(scopes = {Scope.CLUSTER}, description = "Stores latest date of 
Date Field so that the same data will not be fetched multiple times.")
 public class GetSolr extends SolrProcessor {
 
-public static final PropertyDescriptor SOLR_QUERY = new 
PropertyDescriptor
-.Builder().name("Solr Query")
-.description("A query to execute against Solr")
+public static final String STATE_MANAGER_FILTER = 
"stateManager_filter";
+public static final String STATE_MANAGER_CURSOR_MARK = 
"stateManager_cursorMark";
+public static final AllowableValue MODE_XML = new 
AllowableValue("XML");
+public static final AllowableValue MODE_REC = new 
AllowableValue("Records");
--- End diff --

The difficulty with this is that Solr provides various field 
types for different kinds of data. For instance, an integer could be derived 
from an Int, TrieInt (version < 7.0) or Pint (version >= 7.0) field. This 
requires a comprehensive fieldtype-to-datatype mapping.
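A sketch of the kind of mapping this would imply (the Solr field type class names are 
real, but the mapping itself is an assumption for illustration, not something 
implemented in this PR):

    // Map Solr field type classes onto NiFi record field types (deliberately incomplete):
    private static final Map<String, RecordFieldType> FIELD_TYPE_MAPPING = new HashMap<>();
    static {
        FIELD_TYPE_MAPPING.put("solr.TrieIntField", RecordFieldType.INT);   // Solr < 7.0
        FIELD_TYPE_MAPPING.put("solr.IntPointField", RecordFieldType.INT);  // Solr >= 7.0
        FIELD_TYPE_MAPPING.put("solr.StrField", RecordFieldType.STRING);
        FIELD_TYPE_MAPPING.put("solr.TrieDateField", RecordFieldType.TIMESTAMP);
        // ... plus every other type the schema (including dynamic fields) can contain
    }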


---


[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-19 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2199#discussion_r145712892
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java
 ---
@@ -66,42 +79,72 @@
 import org.apache.solr.common.SolrDocument;
 import org.apache.solr.common.SolrDocumentList;
 import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.CursorMarkParams;
 
-@Tags({"Apache", "Solr", "Get", "Pull"})
+@Tags({"Apache", "Solr", "Get", "Pull", "Records"})
 @InputRequirement(Requirement.INPUT_FORBIDDEN)
-@CapabilityDescription("Queries Solr and outputs the results as a 
FlowFile")
+@CapabilityDescription("Queries Solr and outputs the results as a FlowFile 
in the format of XML or using a Record Writer")
+@Stateful(scopes = {Scope.CLUSTER}, description = "Stores latest date of 
Date Field so that the same data will not be fetched multiple times.")
 public class GetSolr extends SolrProcessor {
 
-public static final PropertyDescriptor SOLR_QUERY = new 
PropertyDescriptor
-.Builder().name("Solr Query")
-.description("A query to execute against Solr")
+public static final String STATE_MANAGER_FILTER = 
"stateManager_filter";
+public static final String STATE_MANAGER_CURSOR_MARK = 
"stateManager_cursorMark";
+public static final AllowableValue MODE_XML = new 
AllowableValue("XML");
+public static final AllowableValue MODE_REC = new 
AllowableValue("Records");
--- End diff --

In principle yes, by using the Schema API. But I don't expect this to be too 
easy. I suggest we create a separate ticket for this, as it will require some 
deeper consideration.


---


[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-18 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2199#discussion_r145612359
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java
 ---
@@ -126,6 +126,14 @@
 .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
 .build();
 
+public static final PropertyDescriptor DATE_FILTER = new 
PropertyDescriptor
+.Builder().name("Initial Date Filter")
+.displayName("Initial Date Filter")
+.description("Date value to filter results. Documents with an 
earlier date will not be fetched. The format has to correspond to the date 
pattern of Solr 'YYYY-MM-DDThh:mm:ssZ'")
+.required(false)
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.build();
+
--- End diff --

This property should make it quite obvious how backwards compatibility can 
be achieved. Additionally, I will describe it in the documentation. BTW: 
Where can I change the descriptions of processor usage? I did not find them 
in the nifi-docs folder...
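
For reference, a small sketch showing how a value for this property could be 
checked against the Solr date format mentioned in the description; the sample 
value is illustrative:

    // Sketch only: check that a user-supplied Initial Date Filter value matches
    // Solr's date format, e.g. 2017-10-18T00:00:00Z.
    import java.time.Instant;
    import java.time.format.DateTimeParseException;

    public class DateFilterCheck {
        public static boolean isValidSolrDate(String value) {
            try {
                Instant.parse(value);   // accepts e.g. "2017-10-18T00:00:00Z"
                return true;
            } catch (DateTimeParseException e) {
                return false;
            }
        }
    }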


---


[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-18 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2199#discussion_r145415068
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/SolrProcessor.java
 ---
@@ -275,7 +275,7 @@ protected final boolean isBasicAuthEnabled() {
 }
 
 @Override
-protected final Collection<ValidationResult> 
customValidate(ValidationContext context) {
+protected Collection<ValidationResult> 
customValidate(ValidationContext context) {
--- End diff --

OK. By doing so, I will also have to add this method to PutSolrContentStream.
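
A minimal sketch of what such an override could look like once 
customValidate() is no longer final; the class below extends AbstractProcessor 
only to keep the example self-contained, and the additional check is 
hypothetical:

    // Sketch only: a subclass extending the base validation once customValidate()
    // is no longer final. The extra check mentioned below is hypothetical.
    import org.apache.nifi.components.ValidationContext;
    import org.apache.nifi.components.ValidationResult;
    import org.apache.nifi.processor.AbstractProcessor;
    import org.apache.nifi.processor.ProcessContext;
    import org.apache.nifi.processor.ProcessSession;
    import org.apache.nifi.processor.exception.ProcessException;

    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.List;

    public class ExampleSolrProcessor extends AbstractProcessor {

        @Override
        protected Collection<ValidationResult> customValidate(final ValidationContext context) {
            // keep the checks of the parent class and append processor-specific ones
            final List<ValidationResult> results = new ArrayList<>(super.customValidate(context));
            // hypothetical additional checks would be added to 'results' here
            return results;
        }

        @Override
        public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
            // not relevant for this sketch
        }
    }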


---


[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-18 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2199#discussion_r145408461
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/test/resources/solr/testCollection/conf/schema.xml
 ---
@@ -16,6 +16,16 @@
 
 
 
+
 
 
+
+
+
+
+
+
+
+<uniqueKey>id</uniqueKey>
--- End diff --

The uniqueKey field has to be part of the sorting. Well-configured Solr 
indexes always include this kind of field, as many things will not work 
properly without it. Actually, I have never seen a Solr index without one 
(and I have seen a lot ... ;).
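
For illustration, a short sketch of how a cursor-mark query with the 
uniqueKey field as sort tie-breaker and a date filter could be built with 
SolrJ; the field names (dateField, id) and the lastEndDate value are 
assumptions:

    // Sketch only: cursor-mark paging requires the uniqueKey field in the sort.
    // Field names (dateField, id) and the lastEndDate value are illustrative.
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.common.params.CursorMarkParams;

    public class CursorQueryExample {
        public static SolrQuery buildQuery(String lastEndDate, String cursorMark) {
            SolrQuery query = new SolrQuery("*:*");
            // only fetch documents indexed after the last run
            query.addFilterQuery("dateField:[" + lastEndDate + " TO NOW]");
            // sort by the date field, with the uniqueKey field as tie-breaker
            query.setSort("dateField", SolrQuery.ORDER.asc);
            query.addSort("id", SolrQuery.ORDER.asc);
            // resume paging from the previous cursor mark ("*" for the first page)
            query.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
            query.setRows(100);
            return query;
        }
    }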


---


[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-18 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2199#discussion_r145405902
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/SolrProcessor.java
 ---
@@ -275,7 +275,7 @@ protected final boolean isBasicAuthEnabled() {
 }
 
 @Override
-protected final Collection<ValidationResult> 
customValidate(ValidationContext context) {
+protected Collection<ValidationResult> 
customValidate(ValidationContext context) {
--- End diff --

I did this within the GetSolr class.


---


[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-18 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2199#discussion_r145404225
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java
 ---
@@ -172,157 +203,196 @@ protected void init(final 
ProcessorInitializationContext context) {
 
 @Override
 public void onPropertyModified(PropertyDescriptor descriptor, String 
oldValue, String newValue) {
-lastEndDatedRef.set(UNINITIALIZED_LAST_END_DATE_VALUE);
+clearState.set(true);
--- End diff --

OK, no problem.


---


[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-18 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2199#discussion_r145404160
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java
 ---
@@ -138,10 +168,11 @@ protected void init(final 
ProcessorInitializationContext context) {
 descriptors.add(SOLR_TYPE);
 descriptors.add(SOLR_LOCATION);
 descriptors.add(COLLECTION);
+descriptors.add(RETURN_TYPE);
+descriptors.add(RECORD_WRITER);
 descriptors.add(SOLR_QUERY);
-descriptors.add(RETURN_FIELDS);
-descriptors.add(SORT_CLAUSE);
--- End diff --

This should be safe, as the sorting only affects documents indexed after 
lastEndDate (documents indexed earlier are excluded by the filter query).

---


[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-18 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2199#discussion_r145403786
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java
 ---
@@ -66,42 +79,64 @@
 import org.apache.solr.common.SolrDocument;
 import org.apache.solr.common.SolrDocumentList;
 import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.CursorMarkParams;
 
-@Tags({"Apache", "Solr", "Get", "Pull"})
+@Tags({"Apache", "Solr", "Get", "Pull", "Records"})
 @InputRequirement(Requirement.INPUT_FORBIDDEN)
-@CapabilityDescription("Queries Solr and outputs the results as a 
FlowFile")
+@CapabilityDescription("Queries Solr and outputs the results as a FlowFile 
in the format of XML or using a Record Writer")
+@Stateful(scopes = {Scope.LOCAL}, description = "Stores latest date of 
Date Field so that the same data will not be fetched multiple times.")
--- End diff --

Sorry, this would be the correct filter query:
 fq=dateField:[lastEndDate TO NOW]


---


[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-18 Thread JohannesDaniel
Github user JohannesDaniel commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2199#discussion_r145403415
  
--- Diff: 
nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java
 ---
@@ -66,42 +79,64 @@
 import org.apache.solr.common.SolrDocument;
 import org.apache.solr.common.SolrDocumentList;
 import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.CursorMarkParams;
 
-@Tags({"Apache", "Solr", "Get", "Pull"})
+@Tags({"Apache", "Solr", "Get", "Pull", "Records"})
 @InputRequirement(Requirement.INPUT_FORBIDDEN)
-@CapabilityDescription("Queries Solr and outputs the results as a 
FlowFile")
+@CapabilityDescription("Queries Solr and outputs the results as a FlowFile 
in the format of XML or using a Record Writer")
+@Stateful(scopes = {Scope.LOCAL}, description = "Stores latest date of 
Date Field so that the same data will not be fetched multiple times.")
--- End diff --

Do you really think it is required to read the file? Backwards compatibility 
could also be achieved by adding a filter query like 
fq=dateField:[* TO lastEndDate]. The user would only have to specify the 
value of lastEndDate, e.g. via a property of the processor.


---


[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor

2017-10-06 Thread JohannesDaniel
GitHub user JohannesDaniel opened a pull request:

https://github.com/apache/nifi/pull/2199

NIFI-3248: Improvement of GetSolr Processor

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [ ] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JohannesDaniel/nifi NIFI-3248

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2199.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2199


commit 8a5f7e54edc5640655edd19f15d22fada6ca9900
Author: JohannesDaniel 
Date:   2017-10-05T20:57:53Z

NIFI-3248: Improvement of GetSolr Processor




---

