Hello,

The DocFest is almost upon us, but volunteers from the OpenChain Japan WG have 
briefly reviewed the SPDX files output by some tools, so let me share our 
observations with you.

I am looking forward to joining and meeting you at the DocFest!

Best regards,
Tak

;;; 8<---8=---8<---8=---8<---8=---8<---8=---8<---8=---8<---8=---

# Observation notes on SPDX
OpenChain Japan WG (Summer, 2021)

## Preliminary remarks

Following the example of NTIA's Plugfests, some volunteers from the OpenChain 
Japan WG briefly reviewed SPDX files output by some tools.
We would like to share our observations with the SPDX community.

## SPDX samples

### # 1

```
  Format: SPDX RDF format
  Tool: a commercial solution
  Files:
    nerves_examples (size: 298KB)
    node-express-realworld-example-app (size: 33KB)
    time_1.9 (size: 249KB)
    zephyr-v2.6.0 (size: 23771KB)
```

### # 2

```
  Format: SPDX Tag-Value format
  Tool: FOSSology
    Version: [3.8.0], Branch: [master], Commit: [#fc054b] 2020/06/17 05:52 UTC built @ 2020/06/17 06:35 UTC
    - nomos ("3.8.0-64-gfc054be41".fc054b)
    - monk ("3.8.0-64-gfc054be41".fc054b)
    - ojo ("3.8.0-64-gfc054be41".fc054b)
  Files:
    blinky.ex  (size: 3KB)
    nerves_examples (size: 83KB)
    node-express-realworld-example-app (size: 10KB)
    time-1.9.tar.gz (size: 107KB)
```

## Observations

(The following are opinions and do not imply the conclusions of the discussion 
participants.)

1. File size.
  The size of the generated SPDX file differs between tools even when the scan 
target is the same. For example, a file may become large because it contains 
many "RELATIONSHIP" entries. These differences are visible before looking into 
the SPDX file contents, so "file size of SPDX" may be a candidate metric for 
evaluating SPDX files.

1. Two formats.
  The default output formats of the tools, Tag-Value and RDF, are different. 
We did not align the format types beforehand, so our discussion had to deal 
with two formats even though we had only two tools. As a result, we could not 
compare the SPDX files precisely. (Note: the converter produced a lot of error 
output, and we could not take the time to fix up the data.)

1. Build information.
  The conditions under which software is built can affect "CONCLUDED LICENSE". 
Further consideration is needed.

1. Creating "PACKAGE INFORMATION".
  One tool did not create "PACKAGE INFORMATION" in the SPDX file even though 
the input was an archive file such as zip or tar. The generated SPDX file 
contained many "FILE INFORMATION" entries instead, which made the SPDX files 
difficult for us to analyze.

1. Complementing "PACKAGE INFORMATION".
  It would be nice if tools automatically complemented "PACKAGE INFORMATION", 
for example by using information from a package manager or a dictionary keyed 
on the file's hash value.

1. License decision policy.
  It would be nice to have a way to express the license decision policy that 
can be used universally in any tool. (For example, we did not use FOSSology's 
function for automatically deciding the concluded license, so "CONCLUDED 
LICENSE" is "NOASSERTION" in sample 2. The quality of clearing was not the 
point of this discussion.)

1. Deciding "CONCLUDED LICENSE".
  Whether the license file in the top directory should be treated as the basis 
for determining "DECLARED LICENSE" and "CONCLUDED LICENSE" needs to be 
considered by the community and requires consensus. (For example, "time-1.9" 
has a COPYING file (GPL v3) in the top directory, but in the SPDX files created 
by FOSSology both "CONCLUDED LICENSE" and "DECLARED LICENSE" are NOASSERTION.)


1. Some information is lost during the scan phase.
  FOSSology, for example, can scan a target specified by URL.
  A commercial tool, on the other hand, can only scan archived source code. 
Therefore, if we receive source code that a supplier took from a repository, 
```3.7 Package Download Location``` is no longer available in the SPDX file.

1. Difficulty of partially scanning a git repository.
  For example, if we want to scan only 
[zephyr's hello_world](https://github.com/zephyrproject-rtos/zephyr/tree/main/samples/hello_world),
 it is difficult to retrieve and scan just that part of the repository.
  Git sparse checkout is now available, but we do not know which tools support it.

1. Differences between RDF and Tag-Value.
  Some tags cannot be represented in both formats, and different tools handle 
them differently.
  For example, the specification illustrates Creator in RDF as follows:
   ```
   <CreationInfo>
     <creator> Person: Jane Doe () </creator>
     <creator> Organization: ExampleCodeInspect () </creator>
     <creator> Tool: LicenseFind-1.0 </creator>
   </CreationInfo>
   ```
   In Tag-Value, on the other hand, there is no CreationInfo tag, so FOSSology, 
for example, marks the section with comments as follows:
   ```
   ##-------------------------
   ## Creation Information
   ##-------------------------
   Creator: Tool: spdx2
   Creator: Person: name ()
   ```
   How this is displayed depends on the tool.

1. Handling of ```3 Package Information```
  In the SPDX specification, ```3 Package Information``` is optional, while 
```3.13 ConcludedLicense``` is mandatory.
  Also, ```4 File Information``` states:
   ```
   Starting with SPDX 2.0, it is not necessary to have a package wrapping a set of files.
   ```
   Therefore, some tools do not always provide ```3 Package Information```.

1. Organizing license names.
  The same license often appears under various names, such as GPL, GPL 3.0, 
and GPL-3.0. These are difficult to compare because, except for 
```ConcludedLicense```, there is no way to normalize them to the correct SPDX 
identifier such as ```GPL-3.0-only```.
  In addition, the file size grows as the number of LicenseRef-XXXX entries 
increases.

1. We were reminded of the importance of a validation tool for SPDX files.
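
Several of the observations above (file size, many "RELATIONSHIP" entries, missing "PACKAGE INFORMATION", "NOASSERTION" concluded licenses) can be checked quickly before reading a file in detail. The following is only a minimal sketch, assuming a Tag-Value document given as a string. The function name `summarize_tag_value` and the chosen metrics are our own illustration, not part of any SPDX tool, and multi-line `<text>...</text>` values are not handled.

```python
import re
from collections import Counter

# Matches "Tag: value" lines. Comment lines start with '#', so they never match.
TAG_LINE = re.compile(r"^([A-Za-z][A-Za-z0-9]*):\s*(.*)$")


def summarize_tag_value(text: str) -> dict:
    """Return a few coarse metrics for an SPDX Tag-Value document.

    Illustrative only: multi-line <text>...</text> values are not parsed,
    so a line inside such a value that looks like a tag is counted too.
    """
    tags = Counter()
    noassertion = Counter()
    for line in text.splitlines():
        match = TAG_LINE.match(line.strip())
        if not match:
            continue
        tag, value = match.groups()
        tags[tag] += 1
        if value.strip() == "NOASSERTION":
            noassertion[tag] += 1
    return {
        "size_bytes": len(text.encode("utf-8")),
        "packages": tags["PackageName"],
        "files": tags["FileName"],
        "relationships": tags["Relationship"],
        "creators": tags["Creator"],
        # File-level and package-level concluded licenses left undecided.
        "concluded_noassertion": noassertion["LicenseConcluded"]
        + noassertion["PackageLicenseConcluded"],
    }


# Small made-up document in the style of the FOSSology output quoted above.
sample = """\
SPDXVersion: SPDX-2.2
##-------------------------
## Creation Information
##-------------------------
Creator: Tool: spdx2
Creator: Person: name ()
FileName: ./COPYING
LicenseConcluded: NOASSERTION
FileName: ./time.c
LicenseConcluded: NOASSERTION
Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-File1
"""

summary = summarize_tag_value(sample)
print(summary)
```

A document with `packages` equal to 0 and a large `files` count would correspond to the case where a tool emitted only "FILE INFORMATION" for an archive.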

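For the partial-scan observation, one possible way to fetch only a subdirectory before handing it to a scanner is a partial clone combined with git sparse checkout. The sketch below only builds the git command lines and runs them on request. It assumes a reasonably recent Git that provides `git sparse-checkout set`; the helper names and the destination directory `zephyr-hello` are hypothetical. Whether a given SPDX tool can work with the resulting tree is not something we have verified.

```python
import subprocess


def sparse_checkout_commands(repo_url, subdir, dest, branch="main"):
    """Build git commands that check out only `subdir` of a repository."""
    return [
        # Clone without file contents and without populating the work tree.
        ["git", "clone", "--filter=blob:none", "--no-checkout", repo_url, dest],
        # Restrict the work tree to the requested subdirectory.
        ["git", "-C", dest, "sparse-checkout", "set", subdir],
        # Populate the (now sparse) work tree from the chosen branch.
        ["git", "-C", dest, "checkout", branch],
    ]


def run_sparse_checkout(repo_url, subdir, dest, branch="main"):
    """Run the commands above; raises CalledProcessError if git fails."""
    for command in sparse_checkout_commands(repo_url, subdir, dest, branch):
        subprocess.run(command, check=True)


commands = sparse_checkout_commands(
    "https://github.com/zephyrproject-rtos/zephyr.git",
    "samples/hello_world",
    "zephyr-hello",  # hypothetical destination directory
)
for command in commands:
    print(" ".join(command))
```

Depending on the Git version and sparse-checkout mode, files in the repository's top directory may also be checked out, which could matter when a top-level license file is used as the basis for a license decision.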

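The observation about license names can be illustrated with a small normalization step. This is a sketch under stated assumptions: `KNOWN_SPDX_IDS` is a tiny hand-picked subset of the SPDX License List, and the `ALIASES` table encodes a hypothetical policy (treating an unqualified "GPL 3.0" as `GPL-3.0-only`) that a real project would have to decide for itself. Names that cannot be resolved, such as a bare "GPL", are returned as `None` instead of being guessed.

```python
import re

# Tiny illustrative subset of the SPDX License List, not the full list.
KNOWN_SPDX_IDS = {"GPL-2.0-only", "GPL-3.0-only", "GPL-3.0-or-later", "MIT", "Apache-2.0"}

# Hypothetical alias policy; keys are produced by _key() below.
ALIASES = {
    "gpl30": "GPL-3.0-only",
    "gplv3": "GPL-3.0-only",
    "gpl3": "GPL-3.0-only",
}


def _key(name: str) -> str:
    """Lower-case the name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def normalize_license(name: str):
    """Return an SPDX identifier for `name`, or None if it is ambiguous or unknown."""
    if name in KNOWN_SPDX_IDS:
        return name
    return ALIASES.get(_key(name))


results = {name: normalize_license(name) for name in ["GPL", "GPL 3.0", "GPL-3.0", "GPL-3.0-only"]}
for name, spdx_id in results.items():
    print(f"{name!r} -> {spdx_id}")
```

Keeping unresolved names as `None` makes it visible how many entries would otherwise end up as LicenseRef-XXXX or need manual review.
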
## Volunteers
The following persons are members of LicenseInfo SubGroup or Tooling SubGroup 
of OpenChain Japan WG.

###  SPDX samples
Sample1: Yoshitaka Nishio
Sample2: Norio Kobota

### Discussion

Hirotaka Motai, Hiroyuki Fukuchi, Hiroyuki Ishihara, Jumpei Kiyotoki, Kouki 
Hama, Masaki Ambai, Norio Kobota, Satoru Koizumi, Shi Qiu, Shinsuke Kato, 
Tadayuki Osaki, Takashi Ninjouji, Teppei Asaba, Tomo Dote, Yoshitaka Nishio, 
Yoshiyuki Ito.

--- End of file ---

;;; 8<---8=---8<---8=---8<---8=---8<---8=---8<---8=---8<---8=---



---

Takashi NINJOUJI



Chief Specialist

Advanced Collaborative Software Development and Technology Department

Corporate Software Engineering & Technology Center



Toshiba Corporation

1 Komukai-toshiba-cho, Saiwai-ku Kawasaki-shi, Kanagawa 212-8582, Japan

email: takashi1.ninjo...@toshiba.co.jp

www: http://www.toshiba.co.jp/worldwide/







Attachment: Observations_on_SPDX_by_OpenChain-JWG.md
Description: Observations_on_SPDX_by_OpenChain-JWG.md
