I wrote:


If it bothers you that much. I'd make a flag, cleared at the start of each COPY, and then where we test for CR or LF in CopyAttributeOutCSV, if the flag is not set then set it and issue the warning.



I didn't realise until Bruce told me just now that I was on the hook for this. I guess i should keep my big mouth shut. (Yeah, that's gonna happen ...)


Anyway, here's a tiny patch that does what I had in mind.

cheers

andrew
Index: copy.c
===================================================================
RCS file: /home/cvsmirror/pgsql/src/backend/commands/copy.c,v
retrieving revision 1.234
diff -c -r1.234 copy.c
*** copy.c	6 Nov 2004 17:46:27 -0000	1.234
--- copy.c	2 Dec 2004 23:34:20 -0000
***************
*** 98,103 ****
--- 98,104 ----
  static EolType eol_type;		/* EOL type of input */
  static int	client_encoding;	/* remote side's character encoding */
  static int	server_encoding;	/* local encoding */
+ static bool embedded_line_warning;
  
  /* these are just for error messages, see copy_in_error_callback */
  static bool copy_binary;		/* is it a binary copy? */
***************
*** 1190,1195 ****
--- 1191,1197 ----
  	attr = tupDesc->attrs;
  	num_phys_attrs = tupDesc->natts;
  	attr_count = list_length(attnumlist);
+ 	embedded_line_warning = false;
  
  	/*
  	 * Get info about the columns we need to process.
***************
*** 2627,2632 ****
--- 2629,2653 ----
  		 !use_quote && (c = *test_string) != '\0';
  		 test_string += mblen)
  	{
+ 		/*
+ 		 * We don't know here what the surrounding line end characters
+ 		 * might be. It might not even be under postgres' control. So
+ 		 * we simple warn on ANY embedded line ending character.
+ 		 *
+ 		 * This warning will disappear when we make line parsing field-aware,
+ 		 * so that we can reliably read in embedded line ending characters
+ 		 * regardless of the file's line-end context.
+ 		 *
+ 		 */
+ 
+ 		if (!embedded_line_warning  && (c == '\n' || c == '\r') )
+ 		{
+ 			embedded_line_warning = true;
+ 			elog(WARNING,
+ 				 "CSV fields with embedded linefeed or carriage return "
+ 				 "characters might not be able to be reimported");
+ 		}
+ 
  		if (c == delimc || c == quotec || c == '\n' || c == '\r')
  			use_quote = true;
  		if (!same_encoding)
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Reply via email to