DPAPI examples #
Note that this is a living document and the following is subject to change.
This page gives simple examples of the user-written config.yaml file alongside the working config file generated by fair run. Note that the Data Pipeline API takes the working config file as its input.
Empty code run #
User written config.yaml #
run_metadata:
description: An empty code run
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/FAIRDataPipeline/FDP_validation/
script: |-
R -f simple_working_examples/empty_script.R ${{CONFIG_DIR}}
Working config.yaml #
fair run should create a working config.yaml file, which is read by the Data Pipeline API. In this example, the working config.yaml file is almost identical to the original config.yaml file: ${{CONFIG_DIR}} is replaced by the directory in which the working config.yaml file resides, and latest_commit and remote_repo are added to run_metadata:.
run_metadata:
description: An empty code run
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/FAIRDataPipeline/FDP_validation/
script: |-
R -f simple_working_examples/empty_script.R /Users/SoniaM/datastore/coderun/20210511-231444/
latest_commit: 221bfe8b52bbfb3b2dbdc23037b7dd94b49aaa70
remote_repo: https://github.com/FAIRDataPipeline/FDP_validation
Submission script (R) #
library(rDataPipeline)
# Initialise Code Run
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
finalise(handle)
Write data product (HDF5) #
User written config.yaml #
run_metadata:
description: Write an array
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/FAIRDataPipeline/FDP_validation/
script: |-
R -f simple_working_examples/write_array.R ${{CONFIG_DIR}}
write:
- data_product: test/array
description: test array with simple data
Working config.yaml #
fair run should create a working config.yaml file, which is read by the Data Pipeline API. In this example, ${{CONFIG_DIR}} is replaced by the directory in which the working config.yaml file resides, public: true, latest_commit, and remote_repo are added to run_metadata:, and a version is recorded under use: for the write.
run_metadata:
description: Write an array
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/FAIRDataPipeline/FDP_validation/
script: |-
R -f simple_working_examples/write_array.R /Users/SoniaM/datastore/coderun/20210511-231444/
public: true
latest_commit: 221bfe8b52bbfb3b2dbdc23037b7dd94b49aaa70
remote_repo: https://github.com/FAIRDataPipeline/FDP_validation
write:
- data_product: test/array
description: test array with simple data
use:
version: 0.1.0
Note that, although use: is reserved for aliasing in the user-written config, for simplicity the CLI will always write version here.
Note also that by default the CLI writes public: true to run_metadata:. The user is, however, free to specify public: false for individual writes.
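For instance, to keep a single output private while the run-level default remains public: true, the user-written config might include the following (a sketch; the placement of public at the write-item level is an assumption based on the note above):

```yaml
write:
- data_product: test/array
  description: test array with simple data
  public: false
```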
Submission script (R) #
library(rDataPipeline)
# Initialise Code Run
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
df <- data.frame(a = 1:2, b = 3:4)
rownames(df) <- 1:2
write_array(array = as.matrix(df),
handle = handle,
data_product = "test/array",
component = "component1/a/s/d/f/s",
description = "Some description",
dimension_names = list(rowvalue = rownames(df),
colvalue = colnames(df)),
dimension_values = list(NA, 10),
dimension_units = list(NA, "km"),
units = "s")
finalise(handle)
Read data product (HDF5) #
User written config.yaml #
run_metadata:
description: Read an array
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/FAIRDataPipeline/FDP_validation/
script: |-
R -f simple_working_examples/read_array.R ${{CONFIG_DIR}}
read:
- data_product: test/array
Working config.yaml #
fair run should create a working config.yaml file, which is read by the Data Pipeline API. In this example, ${{CONFIG_DIR}} is replaced by the directory in which the working config.yaml file resides, latest_commit and remote_repo are added to run_metadata:, and a version is recorded under use: for the read.
run_metadata:
description: Read an array
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/FAIRDataPipeline/FDP_validation/
script: |-
R -f simple_working_examples/read_array.R /Users/SoniaM/datastore/coderun/20210511-231444/
latest_commit: 221bfe8b52bbfb3b2dbdc23037b7dd94b49aaa70
remote_repo: https://github.com/FAIRDataPipeline/FDP_validation
read:
- data_product: test/array
use:
version: 0.1.0
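Although the examples here let the CLI choose the version, the use: block can also be set in the user-written config.yaml, e.g. to pin a read to a specific version rather than the latest one (a sketch mirroring the field the CLI writes above):

```yaml
read:
- data_product: test/array
  use:
    version: 0.1.0
```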
Submission script (R) #
library(rDataPipeline)
# Open the connection to the local registry with a given config file
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
data_product <- "test/array"
component <- "component1/a/s/d/f/s"
dat <- read_array(handle = handle,
data_product = data_product,
component = component)
finalise(handle)
Write data product (csv) #
User written config.yaml #
run_metadata:
description: Write csv file
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/FAIRDataPipeline/FDP_validation/
script: |-
R -f simple_working_examples/write_csv.R ${{CONFIG_DIR}}
write:
- data_product: test/csv
description: test csv file with simple data
file_type: csv
Working config.yaml #
run_metadata:
description: Write csv file
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/FAIRDataPipeline/FDP_validation/
script: |-
R -f simple_working_examples/write_csv.R /Users/SoniaM/datastore/coderun/20210511-231444/
public: true
latest_commit: 221bfe8b52bbfb3b2dbdc23037b7dd94b49aaa70
remote_repo: https://github.com/FAIRDataPipeline/FDP_validation
write:
- data_product: test/csv
description: test csv file with simple data
file_type: csv
use:
version: 0.0.1
Submission script (R) #
library(rDataPipeline)
# Open the connection to the local registry with a given config file
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
df <- data.frame(a = 1:2, b = 3:4)
rownames(df) <- 1:2
path <- link_write(handle, "test/csv")
write.csv(df, path)
finalise(handle)
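One base-R caveat with this pattern: write.csv() writes row names as an unnamed leading column by default, which read.csv() then imports as a column named X. If that column is not wanted, pass row.names = FALSE (an illustration independent of the pipeline):

```r
df <- data.frame(a = 1:2, b = 3:4)
path <- tempfile(fileext = ".csv")

# Default: row names are written as an extra leading column
write.csv(df, path)
names(read.csv(path))  # "X" "a" "b"

# Suppress them so the file round-trips cleanly
write.csv(df, path, row.names = FALSE)
names(read.csv(path))  # "a" "b"
```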
Read data product (csv) #
User written config.yaml #
run_metadata:
description: Read csv file
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/FAIRDataPipeline/FDP_validation/
script: |-
R -f simple_working_examples/read_csv.R ${{CONFIG_DIR}}
read:
- data_product: test/csv
Working config.yaml #
run_metadata:
description: Read csv file
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/FAIRDataPipeline/FDP_validation/
script: |-
R -f simple_working_examples/read_csv.R /Users/SoniaM/datastore/coderun/20210511-231444/
latest_commit: 221bfe8b52bbfb3b2dbdc23037b7dd94b49aaa70
remote_repo: https://github.com/FAIRDataPipeline/FDP_validation
read:
- data_product: test/csv
use:
version: 0.0.1
Submission script (R) #
library(rDataPipeline)
# Open the connection to the local registry with a given config file
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
path <- link_read(handle, "test/csv")
df <- read.csv(path)
finalise(handle)
Write data product (point estimate) #
User written config.yaml #
run_metadata:
description: Write point estimate
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/SCRC/SCRCdata/
script: |-
R -f simple_working_examples/write_point_estimate.R ${{CONFIG_DIR}}
write:
- data_product: test/estimate/asymptomatic-period
description: asymptomatic period
Working config.yaml #
run_metadata:
description: Write point estimate
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/SCRC/SCRCdata/
script: |-
R -f simple_working_examples/write_point_estimate.R /Users/SoniaM/datastore/coderun/20210511-231444/
public: true
latest_commit: 221bfe8b52bbfb3b2dbdc23037b7dd94b49aaa70
remote_repo: https://github.com/FAIRDataPipeline/FDP_validation
write:
- data_product: test/estimate/asymptomatic-period
description: asymptomatic period
use:
version: 0.0.1
Submission script (R) #
library(rDataPipeline)
# Open the connection to the local registry with a given config file
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
write_estimate(value = 9,
handle = handle,
data_product = "test/estimate/asymptomatic-period",
component = "asymptomatic-period",
description = "asymptomatic period")
finalise(handle)
Read data product (point estimate) #
User written config.yaml #
run_metadata:
description: Read point estimate
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/SCRC/SCRCdata/
script: |-
R -f simple_working_examples/read_point_estimate.R ${{CONFIG_DIR}}
read:
- data_product: test/estimate/asymptomatic-period
Working config.yaml #
run_metadata:
description: Read point estimate
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/SCRC/SCRCdata/
script: |-
R -f simple_working_examples/read_point_estimate.R /Users/SoniaM/datastore/coderun/20210511-231444/
latest_commit: 221bfe8b52bbfb3b2dbdc23037b7dd94b49aaa70
remote_repo: https://github.com/FAIRDataPipeline/FDP_validation
read:
- data_product: test/estimate/asymptomatic-period
use:
version: 0.0.1
Submission script (R) #
library(rDataPipeline)
# Open the connection to the local registry with a given config file
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
read_estimate(handle = handle,
data_product = "test/estimate/asymptomatic-period",
component = "asymptomatic-period")
finalise(handle)
Write data product (distribution) #
User written config.yaml #
run_metadata:
description: Write distribution
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/SCRC/SCRCdata/
script: |-
R -f simple_working_examples/write_distribution.R ${{CONFIG_DIR}}
write:
- data_product: test/distribution/symptom-delay
description: Estimate of symptom delay
Working config.yaml #
run_metadata:
description: Write distribution
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/SCRC/SCRCdata/
script: |-
R -f simple_working_examples/write_distribution.R /Users/SoniaM/datastore/coderun/20210511-231444/
public: true
latest_commit: 221bfe8b52bbfb3b2dbdc23037b7dd94b49aaa70
remote_repo: https://github.com/FAIRDataPipeline/FDP_validation
write:
- data_product: test/distribution/symptom-delay
description: Estimate of symptom delay
use:
version: 0.0.1
Submission script (R) #
library(rDataPipeline)
# Open the connection to the local registry with a given config file
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
write_distribution(handle = handle,
data_product = "test/distribution/symptom-delay",
component = "symptom-delay",
distribution = "Gaussian",
parameters = list(mean = -16.08, SD = 30),
description = "symptom delay")
finalise(handle)
Read data product (distribution) #
User written config.yaml #
run_metadata:
description: Read distribution
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/SCRC/SCRCdata/
script: |-
R -f simple_working_examples/read_distribution.R ${{CONFIG_DIR}}
read:
- data_product: test/distribution/symptom-delay
Working config.yaml #
run_metadata:
description: Read distribution
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/SCRC/SCRCdata/
script: |-
R -f simple_working_examples/read_distribution.R /Users/SoniaM/datastore/coderun/20210511-231444/
latest_commit: 221bfe8b52bbfb3b2dbdc23037b7dd94b49aaa70
remote_repo: https://github.com/FAIRDataPipeline/FDP_validation
read:
- data_product: test/distribution/symptom-delay
use:
version: 0.0.1
Submission script (R) #
library(rDataPipeline)
# Open the connection to the local registry with a given config file
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
read_distribution(handle = handle,
data_product = "test/distribution/symptom-delay",
component = "symptom-delay")
finalise(handle)
Attach issue to component #
User written config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/attach_issue.R ${{CONFIG_DIR}}
write:
- data_product: test/array/issues/component
description: a test array
Working config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/attach_issue.R /Users/SoniaM/datastore/coderun/20210511-231444/
public: true
latest_commit: e3c0ebdf5ae079bd72f601ec5eefdf998c4fc8ec
remote_repo: https://github.com/fake_org/fake_repo
read: []
write:
- data_product: test/array/issues/component
description: a test array
use:
version: 0.1.0
Submission script (R) #
In R, issues can be attached to components in a number of ways.
Attach an issue on the fly by referencing an index in the handle:
library(rDataPipeline)
# Initialise Code Run
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
df <- data.frame(a = 1:2, b = 3:4)
rownames(df) <- 1:2
component_id <- write_array(array = as.matrix(df),
handle = handle,
data_product = "test/array/issues/component",
component = "component1/a/s/d/f/s",
description = "Some description",
dimension_names = list(rowvalue = rownames(df),
colvalue = colnames(df)),
dimension_values = list(NA, 10),
dimension_units = list(NA, "km"),
units = "s")
issue <- "some issue"
severity <- 7
raise_issue(index = component_id,
handle = handle,
issue = issue,
severity = severity)
finalise(handle)
Attach an issue to a data product that already exists in the data registry by referencing it explicitly:
library(rDataPipeline)
# Initialise Code Run
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
issue <- "some issue"
severity <- 7
raise_issue(handle = handle,
data_product = "test/array/issues/component",
component = "component1/a/s/d/f/s",
version = "0.1.0",
namespace = "username",
issue = issue,
severity = severity)
finalise(handle)
Attach an issue to multiple components at the same time:
raise_issue(index = c(component_id1, component_id2),
handle = handle,
issue = issue,
severity = severity)
or
raise_issue(handle = handle,
data_product = "test/array/issues/component",
component = c("component1/a/s/d/f/s", "component2/a/s/d/f/s"),
version = "0.1.0",
namespace = "username",
issue = issue,
severity = severity)
Attach issue to whole data product #
User written config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/attach_issue.R ${{CONFIG_DIR}}
write:
- data_product: "test/array/issues/whole"
description: a test array
file_type: csv
Working config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/attach_issue.R /Users/SoniaM/datastore/coderun/20210511-231444/
public: true
latest_commit: 40725b40252fd55ba355f7ed66f5a42387f1674f
remote_repo: https://github.com/fake_org/fake_repo
read: []
write:
- data_product: test/array/issues/whole
description: a test array
file_type: csv
use:
version: 0.1.0
Submission script (R) #
In R, issues can be attached to data products in a number of ways.
Attach an issue on the fly by referencing an index in the handle:
library(rDataPipeline)
# Initialise Code Run
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
df <- data.frame(a = 1:2, b = 3:4)
rownames(df) <- 1:2
index <- write_array(array = as.matrix(df),
handle = handle,
data_product = "test/array/issues/whole",
component = "component1/a/s/d/f/s",
description = "Some description",
dimension_names = list(rowvalue = rownames(df),
colvalue = colnames(df)))
write_array(array = as.matrix(df),
handle = handle,
data_product = "test/array/issues/whole",
component = "component2/a/s/d/f/s",
description = "Some description",
dimension_names = list(rowvalue = rownames(df),
colvalue = colnames(df)))
issue <- "some issue"
severity <- 7
raise_issue(index = index,
handle = handle,
issue = issue,
severity = severity,
whole_object = TRUE)
finalise(handle)
Attach an issue to a data product that already exists in the data registry by referencing it explicitly:
library(rDataPipeline)
# Initialise Code Run
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
issue <- "some issue"
severity <- 7
raise_issue(handle = handle,
data_product = "test/array/issues/whole",
version = "0.1.0",
namespace = "username",
issue = issue,
severity = severity)
finalise(handle)
Attach an issue to multiple data products at the same time:
raise_issue(index = c(index1, index2),
handle = handle,
issue = issue,
severity = severity,
whole_object = TRUE)
or
raise_issue(handle = handle,
data_product = c("test/array/issues/whole", "test/array/issues/whole/2"),
version = c("0.1.0", "0.1.0"),
namespace = "username",
issue = issue,
severity = severity)
Attach issue to config #
User written config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/attach_issue.R ${{CONFIG_DIR}}
Working config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/attach_issue.R /Users/SoniaM/datastore/coderun/20210511-231444/
latest_commit: 0d98e732b77e62a6cd390c6aec655f260f5f9b33
remote_repo: https://github.com/fake_org/fake_repo
read: []
write: []
Submission script (R) #
library(rDataPipeline)
# Initialise Code Run
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
config_issue <- "issue with config"
config_severity <- 7
raise_issue_config(handle = handle,
issue = config_issue,
severity = config_severity)
finalise(handle)
Attach issue to submission script #
User written config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/attach_issue.R ${{CONFIG_DIR}}
Working config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/attach_issue.R /Users/SoniaM/datastore/coderun/20210511-231444/
latest_commit: 358f64c4044f3b3f761865ee8e9f4375cf41d155
remote_repo: https://github.com/fake_org/fake_repo
read: []
write: []
Submission script (R) #
library(rDataPipeline)
# Initialise Code Run
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
script_issue <- "issue with script"
script_severity <- 7
raise_issue_script(handle = handle,
issue = script_issue,
severity = script_severity)
finalise(handle)
Attach issue to GitHub repository #
User written config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/attach_issue.R ${{CONFIG_DIR}}
Working config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/attach_issue.R /Users/SoniaM/datastore/coderun/20210511-231444/
latest_commit: 6b23ec822bfd7ea5f419c70ce18fb73b59c90754
remote_repo: https://github.com/fake_org/fake_repo
read: []
write: []
Submission script (R) #
library(rDataPipeline)
# Initialise Code Run
config <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "config.yaml")
script <- file.path(Sys.getenv("FDP_CONFIG_DIR"), "script.sh")
handle <- initialise(config, script)
repo_issue <- "issue with repo"
repo_severity <- 7
raise_issue_repo(handle = handle,
issue = repo_issue,
severity = repo_severity)
finalise(handle)
Attach issue to external object #
This is not something we want to do.
Attach issue to code run #
This might be something we want to do in the future, but not now.
Delete DataProduct (optionally) if identical to previous version #
Delete CodeRun (optionally) if nothing happened #
That is, if no outputs were created and no issues were raised.
CodeRun with aliases (use block example) #
User written config.yaml #
run_metadata:
description: A test model
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: SCRC
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
local_repo: /Users/Soniam/Desktop/git/SCRC/SCRCdata
script: |-
R -f inst/SCRC/scotgov_management/submission_script.R ${{CONFIG_DIR}}
read:
- data_product: test/data/alias
use:
namespace: johnsmith
data_product: scotland/human/population
write:
- data_product: human/outbreak-timeseries
description: data product description
use:
data_product: scotland/human/outbreak-timeseries
- data_product: human/outbreak/simulation_run
description: another data product description
use:
data_product: human/outbreak/simulation_run-${{RUN_ID}}
Working config.yaml #
fair run should create a working config.yaml file, which is read by the Data Pipeline API. In this example, ${{CONFIG_DIR}} is replaced by the directory in which the working config.yaml file resides, and additional metadata (public: true, latest_commit, remote_repo, and version fields under use:) is added.
run_metadata:
description: A test model
local_data_registry_url: https://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: soniamitchell
default_output_namespace: soniamitchell
write_data_store: /Users/SoniaM/datastore/
public: true
local_repo: /Users/Soniam/Desktop/git/SCRC/SCRCdata
latest_commit: 221bfe8b52bbfb3b2dbdc23037b7dd94b49aaa70
remote_repo: https://github.com/ScottishCovidResponse/SCRCdata
script: |-
R -f inst/SCRC/scotgov_management/submission_script.R /Users/SoniaM/datastore/coderun/20210511-231444/
read:
- data_product: human/population
use:
data_product: scotland/human/population
version: 0.1.0
namespace: johnsmith
write:
- data_product: human/outbreak-timeseries
description: data product description
use:
data_product: scotland/human/outbreak-timeseries
version: 0.1.0
- data_product: human/outbreak/simulation_run
description: another data product description
use:
data_product: human/outbreak/simulation_run-${{RUN_ID}}
version: 0.1.0
CodeRun with read globbing #
This example makes use of globbing in the read: block.
First we need to populate the local registry with something to read:
User written config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/input_globbing.R ${{CONFIG_DIR}}
write:
- data_product: real/data/1d06c1840618f1cd0ff29177b34fa68df939a9a8/1
description: A csv file
file_type: csv
- data_product: real/data/1d06c1840618f1cd0ff29177b34fa68df939a9a8/thing/1
description: A csv file
file_type: csv
Working config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/input_globbing.R /Users/SoniaM/datastore/coderun/20210511-231444/
public: true
latest_commit: 064e900b691e80058357a344f02cf73de0166fab
remote_repo: https://github.com/fake_org/fake_repo
read: []
write:
- data_product: real/data/1d06c1840618f1cd0ff29177b34fa68df939a9a8/1
description: A csv file
file_type: csv
use:
version: 0.0.1
- data_product: real/data/1d06c1840618f1cd0ff29177b34fa68df939a9a8/thing/1
description: A csv file
file_type: csv
use:
version: 0.0.1
Now that our local registry is populated, we can try globbing:
User written config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/input_globbing.R ${{CONFIG_DIR}}
read:
- data_product: real/data/1d06c1840618f1cd0ff29177b34fa68df939a9a8/*
Working config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/input_globbing.R /Users/SoniaM/datastore/coderun/20210511-231444/
latest_commit: b9e2187b3796f06ca33f92c3a82863215917ed0e
remote_repo: https://github.com/fake_org/fake_repo
read:
- data_product: real/data/1d06c1840618f1cd0ff29177b34fa68df939a9a8/thing/1
use:
version: 0.0.1
- data_product: real/data/1d06c1840618f1cd0ff29177b34fa68df939a9a8/1
use:
version: 0.0.1
write: []
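The expansion above can be mimicked in base R with utils::glob2rx(), purely to illustrate how the wildcard is interpreted; the actual lookup is performed by the CLI against the local registry:

```r
# Translate the glob from the read: block into a regular expression
pattern <- glob2rx("real/data/1d06c1840618f1cd0ff29177b34fa68df939a9a8/*")

# Hypothetical set of data products registered locally
registered <- c("real/data/1d06c1840618f1cd0ff29177b34fa68df939a9a8/1",
                "real/data/1d06c1840618f1cd0ff29177b34fa68df939a9a8/thing/1",
                "real/data/unrelated/1")

# Everything under the globbed prefix matches, including nested names
registered[grepl(pattern, registered)]
```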
CodeRun with write globbing #
This example makes use of globbing in the write: block.
First we need to populate the local registry with some data:
User written config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/output_globbing.R ${{CONFIG_DIR}}
write:
- data_product: real/data/e8d7af00c8f8e24c2790e2a32241bc1bfc8cf011/1
description: A csv file
file_type: csv
- data_product: real/data/e8d7af00c8f8e24c2790e2a32241bc1bfc8cf011/thing/1
description: A csv file
file_type: csv
Working config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/output_globbing.R /Users/SoniaM/datastore/coderun/20210511-231444/
public: true
latest_commit: 2a8688677321b99e3a2545ce020992d136334b71
remote_repo: https://github.com/fake_org/fake_repo
read: []
write:
- data_product: real/data/e8d7af00c8f8e24c2790e2a32241bc1bfc8cf011/1
description: A csv file
file_type: csv
use:
version: 0.0.1
- data_product: real/data/e8d7af00c8f8e24c2790e2a32241bc1bfc8cf011/thing/1
description: A csv file
file_type: csv
use:
version: 0.0.1
Now that our local registry is populated, we can try globbing. Note that the version field uses the ${{MAJOR}} token, which increments the major component of the latest registered version (here 0.0.1 becomes 1.0.0):
User written config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/output_globbing.R ${{CONFIG_DIR}}
write:
- data_product: real/data/e8d7af00c8f8e24c2790e2a32241bc1bfc8cf011/*
description: A csv file
file_type: csv
use:
version: ${{MAJOR}}
Working config.yaml #
run_metadata:
description: Register a file in the pipeline
local_data_registry_url: http://localhost:8000/api/
remote_data_registry_url: https://data.fairdatapipeline.org/api/
default_input_namespace: username
default_output_namespace: username
write_data_store: /Users/username/datastore/
local_repo: local_repo
script: |-
R -f simple_working_examples/output_globbing.R /Users/SoniaM/datastore/coderun/20210511-231444/
public: true
latest_commit: f95815976cd4d93c062f94a48525fcec88b6ef34
remote_repo: https://github.com/fake_org/fake_repo
read: []
write:
- data_product: real/data/e8d7af00c8f8e24c2790e2a32241bc1bfc8cf011/*
description: A csv file
file_type: csv
use:
version: 1.0.0
- data_product: real/data/e8d7af00c8f8e24c2790e2a32241bc1bfc8cf011/thing/1
description: A csv file
file_type: csv
use:
version: 1.0.0
- data_product: real/data/e8d7af00c8f8e24c2790e2a32241bc1bfc8cf011/1
description: A csv file
file_type: csv
use:
version: 1.0.0
