Re: [HACKERS] shift_sjis_2004 related autority files are remaining

2017-06-26 Thread Kyotaro HORIGUCHI
Hi,

At Sun, 25 Jun 2017 09:20:10 +0900 (JST), Tatsuo Ishii  
wrote in <20170625.092010.542143642647288693.t-is...@sraoss.co.jp>
> > I don't have a clear opinion on this particular issue, but I think we
> > should have clarity on why particular files or code exist.  So why do
> > these files exist and the others don't?  Is it just the license?
> 
> I think so.
> 
> Many of those files are from http://ftp.unicode.org. There's no
> license description there, so I think we should not copy those files,
> to be safe. (I vaguely recall that they explicitly prohibited
> redistributing the files in the past, but I could not find such a
> statement at this moment.)

The license for the files is seen in "EXHIBIT 1" in the following URL.

http://www.unicode.org/copyright.html

Roughly, it says that copies of the files, or software containing
copies of the files, must be accompanied by the same copyright
notice, or that the notice must appear in the associated
documentation. So we could include the files by adding a notice,
but in the end we decided not to keep them in the repository, I think.

> gb-18030-2000.xml and windows-949-2000.xml are from
> https://ssl.icu-project.org/. I do not know what licenses those files
> use (maybe Apache).
> 
> euc-jis-2004-std.txt and sjis-0213-2004-std.txt are from
> http://x0213.org. The license is described in the files.

I don't intend to insist on removing them if someone strongly
wants to preserve them, since their existence doesn't harm
anything.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] shift_sjis_2004 related autority files are remaining

2017-06-24 Thread Tatsuo Ishii
> I don't have a clear opinion on this particular issue, but I think we
> should have clarity on why particular files or code exist.  So why do
> these files exist and the others don't?  Is it just the license?

I think so.

Many of those files are from http://ftp.unicode.org. There's no
license description there, so I think we should not copy those files,
to be safe. (I vaguely recall that they explicitly prohibited
redistributing the files in the past, but I could not find such a
statement at this moment.)

gb-18030-2000.xml and windows-949-2000.xml are from
https://ssl.icu-project.org/. I do not know what licenses those files
use (maybe Apache).

euc-jis-2004-std.txt and sjis-0213-2004-std.txt are from
http://x0213.org. The license is described in the files.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] shift_sjis_2004 related autority files are remaining

2017-06-23 Thread Peter Eisentraut
On 6/23/17 11:15, Tatsuo Ishii wrote:
>> For clarity, I personally prefer to keep all the source text files
>> in the repository, especially so that we can detect changes to
>> them. But since we decided that most of them should not be
>> there (for license reasons), I just don't see a reason to
>> keep only the rest, even though they carry no such restriction.
> 
> So are you saying that if n of the m authority files are not kept
> because of a license issue, then the remaining m-n authority files
> should not be kept either? What's the benefit to us of doing so?

I don't have a clear opinion on this particular issue, but I think we
should have clarity on why particular files or code exist.  So why do
these files exist and the others don't?  Is it just the license?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] shift_sjis_2004 related autority files are remaining

2017-06-23 Thread Tatsuo Ishii
> For clarity, I personally prefer to keep all the source text files
> in the repository, especially so that we can detect changes to
> them. But since we decided that most of them should not be
> there (for license reasons), I just don't see a reason to
> keep only the rest, even though they carry no such restriction.

So are you saying that if n of the m authority files are not kept
because of a license issue, then the remaining m-n authority files
should not be kept either? What's the benefit to us of doing so?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] shift_sjis_2004 related autority files are remaining

2017-06-22 Thread Kyotaro HORIGUCHI
At Fri, 23 Jun 2017 10:04:26 +0900 (JST), Tatsuo Ishii  
wrote in <20170623.100426.157023025943107410.t-is...@sraoss.co.jp>
> > Hm. I am wondering about licensing issues here to keep those files in
> > the tree. I am no lawyer.
> 
> Of course. Regarding euc-jis-2004-std.txt and sjis-0213-2004-std.txt,
> it seems safe to keep them.
> 
> > ## Date: 13 May 2006
> > ## License:
> > ##  Copyright (C) 2001 earth...@tama.or.jp, All Rights Reserved.
> > ##  Copyright (C) 2001 I'O, All Rights Reserved.
> > ##  Copyright (C) 2006 Project X0213, All Rights Reserved.
> > ##  You can use, modify, distribute this table freely.
> 
> >> - It lets us track changes in the original file if we decide to
> >>   change the map files.
> > 
> > You have done that in the past for a couple of codepoints, didn't you?
> 
> I believe the reason I didn't keep the other txt files was that
> their license prohibited keeping copies.

For clarity, I personally prefer to keep all the source text files
in the repository, especially so that we can detect changes to
them. But since we decided that most of them should not be
there (for license reasons), I just don't see a reason to
keep only the rest, even though they carry no such restriction.

> >> - The site http://x0213.org/ may disappear in the future. If that
> >>   happens, we will lose track of the data the map files were
> >>   created from.
> > 
> > There are other problems then as there are 3 sites in use to fetch the data:
> > - GB2312.TXT comes from greenstone.org.
> > - Some from icu-project.org.
> > - The rest is from unicode.org.
> 
> Maybe, but I don't know how to deal with them.

Apart from detecting changes, as mentioned upthread: if we ever need
the authority files after the authority itself has disappeared
(though I don't see why we would), we can regenerate a linear mapping
from a .map file. But I believe that further changes (that we would
have to follow) will hardly ever come.
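
As an illustration of the change-detection point, here is a minimal
shell sketch, not something that exists in the tree. It assumes a
SHA-256 digest of each downloaded authority file was recorded when the
maps were generated; the function name check_authority and the use of
sha256sum (GNU coreutils) are assumptions made for this example.

```sh
# Hypothetical helper: report whether an authority file still matches
# a digest recorded earlier.
# Usage: check_authority FILE EXPECTED_SHA256
check_authority() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <name>"; keep only the digest.
    actual=$(sha256sum "$file" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
        echo "unchanged"
    else
        echo "changed"
        return 1
    fi
}
```

With something like this, only the digests would need to live in the
repository, which sidesteps the question of redistributing the files
themselves.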

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: [HACKERS] shift_sjis_2004 related autority files are remaining

2017-06-22 Thread Tatsuo Ishii
> Hm. I am wondering about licensing issues here to keep those files in
> the tree. I am no lawyer.

Of course. Regarding euc-jis-2004-std.txt and sjis-0213-2004-std.txt,
it seems safe to keep them.

> ## Date: 13 May 2006
> ## License:
> ##  Copyright (C) 2001 earth...@tama.or.jp, All Rights Reserved.
> ##  Copyright (C) 2001 I'O, All Rights Reserved.
> ##  Copyright (C) 2006 Project X0213, All Rights Reserved.
> ##  You can use, modify, distribute this table freely.

>> - It lets us track changes in the original file if we decide to
>>   change the map files.
> 
> You have done that in the past for a couple of codepoints, didn't you?

I believe the reason I didn't keep the other txt files was that
their license prohibited keeping copies.

>> - The site http://x0213.org/ may disappear in the future. If that
>>   happens, we will lose track of the data the map files were
>>   created from.
> 
> There are other problems then as there are 3 sites in use to fetch the data:
> - GB2312.TXT comes from greenstone.org.
> - Some from icu-project.org.
> - The rest is from unicode.org.

Maybe, but I don't know how to deal with them.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] shift_sjis_2004 related autority files are remaining

2017-06-22 Thread Michael Paquier
On Fri, Jun 23, 2017 at 9:39 AM, Tatsuo Ishii  wrote:
> I think we should keep the original .txt files because:

Hm. I am wondering about licensing issues here to keep those files in
the tree. I am no lawyer.

> - It lets us track changes in the original file if we decide to
>   change the map files.

You have done that in the past for a couple of codepoints, didn't you?

> - The site http://x0213.org/ may disappear in the future. If that
>   happens, we will lose track of the data the map files were
>   created from.

There are other problems then as there are 3 sites in use to fetch the data:
- GB2312.TXT comes from greenstone.org.
- Some from icu-project.org.
- The rest is from unicode.org.

> I believe we had better follow the way src/timezone keeps
> the original timezone data.
>
> The above reasoning would not hold if we had a way to reconstruct the
> original txt files from the map files, but I doubt it's worth the
> trouble to create such tools.

That's true as well. No need for reverse-engineering if there is no
reason to. That would be possible though.
-- 
Michael




Re: [HACKERS] shift_sjis_2004 related autority files are remaining

2017-06-22 Thread Tatsuo Ishii
>> Why do you believe so?
> 
> Unicode/Makefile includes that:
> euc-jis-2004-std.txt sjis-0213-2004-std.txt:
> 	$(DOWNLOAD) http://x0213.org/codetable/$(@F)
> 
> So those files ought to be downloaded when rebuilding the maps, and
> they should not be in the tree. In short, I think that Horiguchi-san
> is right. On top of the two pointed out by Horiguchi-san,
> gb-18030-2000.xml should not be in the tree.

I think we should keep the original .txt files because:

- It lets us track changes in the original file if we decide to
  change the map files.

- The site http://x0213.org/ may disappear in the future. If that
  happens, we will lose track of the data the map files were
  created from.

I believe we had better follow the way src/timezone keeps
the original timezone data.

The above reasoning would not hold if we had a way to reconstruct the
original txt files from the map files, but I doubt it's worth the
trouble to create such tools.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] shift_sjis_2004 related autority files are remaining

2017-06-22 Thread Michael Paquier
On Fri, Jun 23, 2017 at 8:12 AM, Tatsuo Ishii  wrote:
>> Hi, I happened to notice that the backend/utils/mb/Unicode directory
>> contains two encoding authority files, which I believe should not
>> be there.

(Worked on that with Horiguchi-san a couple of weeks back.)

>> euc-jis-2004-std.txt
>> sjis-0213-2004-std.txt
>
> Why do you believe so?

Unicode/Makefile includes that:
euc-jis-2004-std.txt sjis-0213-2004-std.txt:
	$(DOWNLOAD) http://x0213.org/codetable/$(@F)

So those files ought to be downloaded when rebuilding the maps, and
they should not be in the tree. In short, I think that Horiguchi-san
is right. On top of the two pointed out by Horiguchi-san,
gb-18030-2000.xml should not be in the tree.
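
For reference, $(@F) in GNU make expands to the file part of the
target name, so each target is fetched from a URL ending in its own
name. A rough shell equivalent of that expansion, using the
hypothetical function name authority_url and without actually
downloading anything:

```sh
# Rough shell equivalent of "http://x0213.org/codetable/$(@F)".
# authority_url is a hypothetical name used only for this illustration.
authority_url() {
    # basename strips any directory part, as $(@F) does for a make target.
    echo "http://x0213.org/codetable/$(basename "$1")"
}

for target in euc-jis-2004-std.txt sjis-0213-2004-std.txt; do
    authority_url "$target"
done
```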
-- 
Michael




Re: [HACKERS] shift_sjis_2004 related autority files are remaining

2017-06-22 Thread Tatsuo Ishii
> Hi, I happened to notice that the backend/utils/mb/Unicode directory
> contains two encoding authority files, which I believe should not
> be there.
> 
> euc-jis-2004-std.txt
> sjis-0213-2004-std.txt

Why do you believe so?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] shift_sjis_2004 related autority files are remaining

2017-06-22 Thread Robert Haas
On Fri, Apr 7, 2017 at 1:59 AM, Kyotaro HORIGUCHI
 wrote:
> Hi, I happened to notice that the backend/utils/mb/Unicode directory
> contains two encoding authority files, which I believe should not
> be there.
>
> euc-jis-2004-std.txt
> sjis-0213-2004-std.txt
>
> And what is more astonishing, make distclean didn't do its work.
>
> | $ make distclean
> | rm -f
>
> The Makefile there is missing the definition of TEXTS.
>
> # Sorry for the bogus patch by me..
>
> The attached is the *first patch* that fixes distclean and adds
> the two files into GENERICTEXTS.
>
> =
>
> I don't attach the *second* patch since it's too large for such a
> trivial change and can be produced by the following steps.
>
> $ cd src/backend/utils/mb/Unicode
> $ git rm *.txt
> $ git commit

I think you are right about all of this, although I am not an expert
in this area.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] shift_sjis_2004 related autority files are remaining

2017-04-06 Thread Kyotaro HORIGUCHI
Hi, I happened to notice that the backend/utils/mb/Unicode directory
contains two encoding authority files, which I believe should not
be there.

euc-jis-2004-std.txt
sjis-0213-2004-std.txt

And what is more astonishing, make distclean didn't do its work.

| $ make distclean
| rm -f

The Makefile there is missing the definition of TEXTS.
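
The empty "rm -f" is what one would expect if the distclean rule
removes $(TEXTS) and that variable is undefined, since an undefined
make variable expands to nothing. The following shell sketch only
imitates that expansion by echoing the command; the function name
show_clean_cmd is made up, and the assumption that the rule is roughly
"rm -f $(TEXTS)" is inferred from the output above.

```sh
show_clean_cmd() {
    # $1 is left unquoted on purpose so that an empty list vanishes,
    # the same way an undefined make variable expands to nothing.
    echo rm -f $1
}

show_clean_cmd ""                                             # TEXTS undefined
show_clean_cmd "euc-jis-2004-std.txt sjis-0213-2004-std.txt"  # TEXTS defined
```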

# Sorry for the bogus patch by me..

The attached is the *first patch* that fixes distclean and adds
the two files into GENERICTEXTS.

=

I don't attach the *second* patch since it's too large for such a
trivial change and can be produced by the following steps.

$ cd src/backend/utils/mb/Unicode
$ git rm *.txt
$ git commit


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 59c069baaee7a4125fe7071e999c9b2a9d0e40d2 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 7 Apr 2017 14:33:56 +0900
Subject: [PATCH 1/2] Fix distclean of src/backend/utils/mb/Unicode

The Makefile there is missing the definition of TEXTS, which make
distclean relies on. This restores the definition.

diff --git a/src/backend/utils/mb/Unicode/Makefile b/src/backend/utils/mb/Unicode/Makefile
index 8f3afa0..c06b7a1 100644
--- a/src/backend/utils/mb/Unicode/Makefile
+++ b/src/backend/utils/mb/Unicode/Makefile
@@ -68,7 +68,9 @@ WINTEXTS = CP866.TXT CP874.TXT CP936.TXT \
 	CP1256.TXT CP1257.TXT CP1258.TXT
 
 GENERICTEXTS = $(ISO8859TEXTS) $(WINTEXTS) \
-	KOI8-R.TXT KOI8-U.TXT
+	KOI8-R.TXT KOI8-U.TXT sjis-0213-2004-std.txt euc-jis-2004-std.txt
+
+TEXTS = $(GENERICTEXTS) $(ISO8859TEXTS) $(WINTEXTS)
 
 all: $(MAPS)
 
-- 
2.9.2

