Title: | Query the 'UniProtKB' REST API |
---|---|
Description: | Retrieve protein information from the 'UniProtKB' REST API (see <https://www.uniprot.org/help/api_queries>). |
Authors: | Guillaume Voisinne [aut, cre] |
Maintainer: | Guillaume Voisinne <[email protected]> |
License: | GPL-3 |
Version: | 1.0.5 |
Built: | 2025-02-27 04:02:46 UTC |
Source: | https://github.com/voisinneg/queryup |
Accessory function used to build the query url
build_query_url( query = NULL, base_url = "https://rest.uniprot.org/uniprotkb/", columns = c("accession", "id", "gene_names", "organism_name", "reviewed"), format = "json" )
query | list of keys corresponding to UniProt's query fields. For example: list("gene_exact" = c("Pik3r1", "Pik3r2"), "organism_id" = c("10090", "9606"), "reviewed" = "true") |
base_url | The base URL for the UniProt REST API |
columns | names of UniProt data columns to retrieve. |
format | format of the response provided by the UniProt API |
the query url
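As an illustration of how a query list can map onto a URL, here is a minimal sketch in base R. It is not the package's implementation: `sketch_query_url` is a hypothetical helper, and the `search?query=...&fields=...&format=...` layout and the parentheses around each field group are assumptions. The only rule taken from this manual is that items within a query field are joined with '+OR+' and different query fields with '+AND+'.

```r
# Hypothetical helper for illustration only; NOT the code of build_query_url().
# Assumption: the endpoint layout "search?query=...&fields=...&format=...".
sketch_query_url <- function(query,
                             base_url = "https://rest.uniprot.org/uniprotkb/",
                             columns = c("accession", "id"),
                             format = "tsv") {
  # Build one "(field:a+OR+field:b)" group per query field
  groups <- vapply(names(query), function(field) {
    terms <- paste0(field, ":", query[[field]])
    paste0("(", paste(terms, collapse = "+OR+"), ")")
  }, character(1))

  # Join the groups with '+AND+' and append the requested columns and format
  paste0(
    base_url, "search?query=", paste(groups, collapse = "+AND+"),
    "&fields=", paste(columns, collapse = ","),
    "&format=", format
  )
}

query <- list("gene_exact" = c("Pik3r1", "Pik3r2"), "reviewed" = "true")
example_url <- sketch_query_url(query)
print(example_url)
```

The sketch does no URL encoding and no validation of field names, so it is only suitable for simple alphanumeric values such as gene names and accession IDs.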
Accessory function removing invalid values from a query
clean_query(query, df)
query | list of keys corresponding to UniProt's query fields. For example: list("gene_exact" = c("Pik3r1", "Pik3r2"), "organism_id" = c("10090", "9606"), "reviewed" = "true") |
df | data.frame with invalid values (in column "value") and corresponding query field (in column "field"). |
the input query without the invalid values
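The cleaning step can be pictured with the following base R sketch. `sketch_clean_query` is a hypothetical stand-in written for this example, not the package's code. It assumes only what is documented above: `df` has a "field" column and a "value" column. Dropping fields that end up empty is a choice made in this sketch and may differ from what `clean_query()` does.

```r
# Hypothetical re-implementation for illustration; NOT the code of clean_query().
sketch_clean_query <- function(query, df) {
  # Nothing to remove when no invalid values were reported
  if (is.null(df) || nrow(df) == 0) {
    return(query)
  }
  for (field in intersect(names(query), unique(df$field))) {
    invalid <- df$value[df$field == field]
    # Keep only the values that were not flagged as invalid for this field
    query[[field]] <- query[[field]][!query[[field]] %in% invalid]
  }
  # Assumption of this sketch: fields left without any value are dropped
  query[lengths(query) > 0]
}

query <- list("accession_id" = c("P22682", "xxx"), "reviewed" = "true")
invalid <- data.frame(field = "accession_id", value = "xxx",
                      stringsAsFactors = FALSE)
cleaned <- sketch_clean_query(query, invalid)
str(cleaned)
```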
Retrieve data from UniProt using UniProt's REST API
get_uniprot_data( query = NULL, base_url = "https://rest.uniprot.org/uniprotkb/", columns = c("accession", "id", "gene_names", "organism_id", "reviewed") )
query | list of keys corresponding to UniProt's query fields. For example: list("gene_exact" = c("Pik3r1", "Pik3r2"), "organism_id" = c("10090", "9606"), "reviewed" = "true"). See 'query_fields' for available query fields. |
base_url | The base URL for the UniProt REST API |
columns | names of UniProt data columns to retrieve. Examples include "accession", "id", "gene_names", "keyword", "sequence". See 'return_fields' for available return fields. |
a list with the following items :
the query url
the http status code for the request
messages returned by the REST API
a data.frame containing the query results
    # Getting gene names, keywords and protein sequences for a set of UniProt IDs.
    ids <- c("P22682", "P47941")
    cols <- c("accession", "id", "gene_names", "keyword", "sequence")
    query = list("accession_id" = ids)
    df <- get_uniprot_data(query = query, columns = cols)$content
    df
Accessory function retrieving invalid values from messages returned by the UniProt API.
parse_messages(messages)
messages | character string containing the error messages returned by the UniProt API |
a data.frame with invalid values (in column "value") and corresponding query field (in column "field"). NULL if no invalid values are identified.
Query fields that can be used to generate queries using 'queryup' along with associated examples and description.
query_fields
A data frame with 44 rows and 3 variables:
Name of the query field
Example query (as appearing in the query url)
Description of the example query
https://www.uniprot.org/help/query-fields
Retrieve data from UniProt using UniProt's REST API. To avoid unresponsive requests, large queries are split into smaller queries with at most max_keys items per query field. Note that it works only with queries where items within query fields are collapsed with '+OR+' and different query fields are collapsed with '+AND+' (see query_uniprot()).
query_uniprot( query = NULL, base_url = "https://rest.uniprot.org/uniprotkb/", columns = c("accession", "id", "gene_names", "organism_id", "reviewed"), max_keys = 200, updateProgress = NULL, show_progress = TRUE )
query | list of keys corresponding to UniProt's query fields. For example: query = list("gene_exact" = c("Pik3r1", "Pik3r2"), "organism_id" = c("10090", "9606"), "reviewed" = "true"). See 'query_fields' for available query fields. |
base_url | The base URL for the UniProt REST API |
columns | names of UniProt data columns to retrieve. Examples include "accession", "id", "gene_names", "keyword", "sequence". See 'return_fields' for available return fields. |
max_keys | maximum number of items per query field submitted in a single query |
updateProgress | used to display progress in shiny apps |
show_progress | Show progress bar |
a data.frame
    # Get the UniProt entries for a set of UniProt accession IDs
    ids <- c("P22682", "P47941")
    query = list("accession_id" = ids)
    df <- query_uniprot(query = query)
    head(df)
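The splitting into smaller queries described for `query_uniprot()` can be sketched in base R as follows. This is an assumption about the general idea, not the package's actual code: `split_keys` is a hypothetical helper, and the identifiers are made-up placeholders rather than real accessions.

```r
# Hypothetical helper for illustration; NOT taken from 'queryup'.
# Splits a vector of query values into chunks of at most max_keys items.
split_keys <- function(values, max_keys = 200) {
  split(values, ceiling(seq_along(values) / max_keys))
}

# Placeholder identifiers (not real UniProt accessions)
ids <- sprintf("ID%03d", 1:450)
chunks <- split_keys(ids, max_keys = 200)

# One sub-query per chunk; each could then be submitted separately
# and the resulting data.frames combined row-wise.
sub_queries <- lapply(chunks, function(chunk) list("accession_id" = chunk))

print(lengths(chunks))
```

With 450 values and `max_keys = 200`, the sketch produces three sub-queries, none of which holds more than 200 items.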
Return fields that can be retrieved using 'queryup' along with their label (column "Label") as appearing in the retrieved data.frame.
return_fields
A data frame with 287 rows and 2 variables:
Name of the returned field
Label of the corresponding column in the retrieved data.frame
https://www.uniprot.org/help/return_fields
Entry names and other attributes of 1000 UniProt entries in Mus musculus.
uniprot_entries
A data frame with 1000 rows and 5 variables:
UniProt entry accession id
UniProt entry name
Gene names
Taxon ID
Swiss-Prot review status