GitHub
Supported on all Sourcegraph pricing plans.
Site admins can sync Git repositories hosted on GitHub.com and GitHub Enterprise with Sourcegraph so that users can search and navigate the repositories.
There are two ways to connect with GitHub: using a GitHub App or using an access token.
Supported versions
- GitHub.com
- GitHub Enterprise v2.10 and newer
Using a GitHub App
Supported in Sourcegraph 5.1+.
There are two ways to connect a GitHub App:
- Create a new GitHub App via Sourcegraph, which will set up all of the required permissions and automatically retrieve the credentials
- Add an existing GitHub App by providing the details of an existing GitHub App (required when using GitHub Enterprise Apps)
Creating a new GitHub App
To create a GitHub App and connect it to Sourcegraph:
- Go to Site admin > Repositories > Github Apps on Sourcegraph.

- Click Create GitHub App.

- Enter a name for your app (it must be unique across your GitHub instance) and the URL of your GitHub instance.
  You may optionally specify an organization to register the app with. If no organization is specified (the default), the app will be owned by the account of the user who creates it on GitHub.
  You may also optionally set the App visibility to public. A GitHub App must be made public if you wish to install it on multiple organizations or user accounts. The default is private.
  The GitHub App will require the following permissions:
    Contents (Repository contents, commits, branches, downloads, releases, and merges): read
    Emails (Manage a user's email addresses): read
    Members (Organization members and teams): read
    Metadata (Search repositories, list collaborators, and access repository metadata): read

- When you click Create GitHub App, you will be redirected to GitHub to confirm the details of the App to be created.

- To complete the setup on GitHub, you will be asked to review the App permissions and select which repositories the App can access before installing it in a namespace. The default is All repositories. Any repositories that you choose to omit will not be able to be synced by Sourcegraph. You can change this later.

- Click Install. Once complete, you will be redirected back to Sourcegraph, where you will now be able to view and manage the details of your new GitHub App from within Sourcegraph.

- Sourcegraph needs to map Sourcegraph users to GitHub users. Click Reveal secret to get the JSON configuration for the auth provider and copy/paste it into the "auth.providers" section of your site configuration.

- Click Add connection under your new installation to create a code host connection to GitHub with this App installation. By default, it will sync all repositories the App can access within the namespace where it was installed. Repository permission enforcement will also be turned on by default.
You can now select repositories to sync or see more configuration options in the configuration section.
Syncing repositories from all installations
When creating a code host connection for a GitHub App with multiple installations, you have two options:
- Sync from a specific installation: Create a connection under a specific installation (as described above). This will only sync repositories from that particular installation.
- Sync from all installations: Create a connection that syncs repositories from all installations of the GitHub App by omitting the installationID field in the connection configuration. This is useful when you want a single connection to handle repositories across multiple organizations or user accounts.
- (Optional) If you want to sync repositories from other organization or user namespaces and your GitHub App is set to public visibility, you can create additional installations with Add installation.
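The difference between the two connection styles can be sketched in Python. This is a minimal illustration, not Sourcegraph's implementation: it assumes that installationID lives inside the gitHubAppDetails object of the connection configuration, and the appID and baseURL fields and all numeric values are hypothetical placeholders.

```python
import copy


def connection_for_all_installations(connection):
    """Return a copy of a GitHub App connection config with installationID omitted.

    Assumption: installationID is nested under "gitHubAppDetails".
    """
    result = copy.deepcopy(connection)
    app_details = result.get("gitHubAppDetails")
    if isinstance(app_details, dict):
        app_details.pop("installationID", None)
    return result


# Hypothetical connection that syncs from one specific installation.
single_installation = {
    "url": "https://github.com",
    "gitHubAppDetails": {
        "appID": 123456,                  # hypothetical value
        "baseURL": "https://github.com",  # hypothetical field and value
        "installationID": 987654,         # hypothetical value
    },
    "authorization": {},
}

# The same connection, rewritten to sync from every installation of the App.
all_installations = connection_for_all_installations(single_installation)
```

The original dictionary is left untouched, so both variants can be compared before either is pasted into the code host connection editor.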
NOTE: When you create a GitHub App, Sourcegraph automatically sets up an incoming webhook for the app. This webhook subscribes to events for any repository or organization the app has access to, allowing Sourcegraph to keep repository and permission data in sync with GitHub.
NOTE: If you are using Batch Changes, you can create a GitHub App to perform commit signing (Beta).
Multiple installations
The initial GitHub App setup will only install the App on the organization or user account that you registered it with. If your code is spread across multiple organizations or user accounts, you will need to create additional installations for each namespace that you want Sourcegraph to sync repositories from.
By default, Sourcegraph creates a private GitHub App, which only allows the App to be installed on the same organization or user account that it was created in. If you did not set the App to public visibility during creation, you will need to change the visibility to public before you can install it in other namespaces. For security considerations, see GitHub's documentation on private vs public apps.
Once public, the App can be installed in additional namespaces either from Sourcegraph or from GitHub.
Installing from Sourcegraph
- Go to Site admin > Repositories > Github Apps and click Edit on the App you want to install in another namespace. You'll be taken to the App details page.

- Click Add installation. You will be redirected to GitHub to pick which other organization to install the App on and finish the installation process.
  NOTE: Only organization owners can install GitHub Apps on an organization. If you are not an owner, you will need to ask an owner to install the App for you.

- As before, you will be asked to review the App permissions and select which repositories the App can access before installing it in a namespace. Once you click Install and the setup completes, you will be redirected back to Sourcegraph, where you will now see your additional installation listed.

- To sync repositories from this installation, click Add connection under your new installation.
Installing from GitHub
- Go to the GitHub App page. You can get here easily from Sourcegraph by clicking View in GitHub for the App you want to install in another namespace.
- Click Configure, or go to App settings > Install App, and select the organization or user account you want to install the App on.
- As before, you will be asked to review the App permissions and select which repositories the App can access before installing it in a namespace. Once you click Install and the setup completes, you will be redirected back to Sourcegraph.
- GitHub App installations will be automatically synced in the background. Return to Site admin > Repositories > Github Apps and click Edit on the App you added the new installation for. You'll be taken to the App details page. Once synced, you will see the new installation listed.

- To sync repositories from this installation, click Add connection under your new installation.
Adding an existing GitHub App
If you have an existing GitHub App (such as a GitHub Enterprise App), you can connect it to Sourcegraph by providing its details manually. This is particularly useful for:
- GitHub Enterprise Apps: Apps that can be installed across multiple organizations within an enterprise account while maintaining private visibility
- Existing GitHub Apps: Apps that were created outside of Sourcegraph that you want to use for repository syncing
To add an existing GitHub App:
- Go to Site admin > Repositories > Github Apps on Sourcegraph.
- Click Add an existing GitHub App.
- Enter the following details of your existing GitHub App:
  - GitHub URL: The URL of your GitHub instance (e.g., https://github.com or https://github-enterprise.example.com)
  - Client ID: The unique identifier for your GitHub App
  - Private Key: The private key generated for your GitHub App (in PEM format)
- Click Add GitHub App to save the configuration.
- Once added, you can manage installations and create code host connections just like with apps created through Sourcegraph.
NOTE: You'll still need to create a client secret on GitHub and set up an auth provider if you would like to enable repository permissions.
When creating code host connections for your existing GitHub App, you can choose to sync repositories from a specific installation or from all installations by omitting the installationID field (see Syncing repositories from all installations).
NOTE: For GitHub Enterprise Apps, this method allows you to use a single app across multiple organizations within your enterprise, avoiding the need to create separate public apps or multiple private apps for each organization.
NOTE: Make sure your existing GitHub App has the required permissions listed in the Creating a new GitHub App section.
Configuring Multiple Private GitHub Apps for Sourcegraph
If you prefer not to make your GitHub App public and instead create separate private GitHub Apps for each organization, users will need to authorize each GitHub App individually to ensure proper repository access and permissions syncing. This is because GitHub Apps have narrowly scoped permissions and do not share authentication across multiple installations. To streamline this process, users can go to User Settings > Account Security in Sourcegraph and connect all necessary GitHub Apps from there. Once authorized, Sourcegraph will use the user's GitHub identity to determine access across all configured GitHub Apps.
GitHub Enterprise Apps: For customers using GitHub Enterprise Cloud, we recommend creating a GitHub Enterprise App, which can be installed across multiple organizations within an enterprise account while maintaining private visibility. This eliminates the need to create separate apps for each organization or make your app public.
To use a GitHub Enterprise App with Sourcegraph:
- Create the Enterprise App in your GitHub Enterprise settings (see GitHub's documentation)
- Use the Add an existing GitHub App option in Sourcegraph to connect it
Uninstalling an App
You can uninstall a GitHub App from a namespace or remove it altogether at any time.
To remove an installation in a single namespace, click View in GitHub for the installation you want to remove. If you are able to administer Apps in this namespace, you will see Uninstall "[APP NAME]" in the "Danger zone" at the bottom of the page. Click Uninstall to remove the App from this namespace. Sourcegraph will periodically sync installations in the background. It may temporarily throw errors related to the missing installation until the sync completes. You can check the GitHub App details page to confirm the installation has been removed.
To remove an App entirely, go to Site admin > Repositories > Github Apps and click Remove for the App you want to remove. You will be prompted to confirm you want to remove the App from Sourcegraph. Once removed from the Sourcegraph side, Sourcegraph will no longer communicate with your GitHub instance via the App unless explicitly reconnected. However, the App will still exist on GitHub unless manually deleted there, as well.
GitHub App token use
Sourcegraph uses the tokens from GitHub Apps in the following ways:
Installation access tokens
Installation access tokens are short-lived, non-refreshable tokens that give Sourcegraph access to the repositories the GitHub App has been given access to. Sourcegraph uses these tokens to clone repositories and to determine which users should be able to view a repository. These tokens expire after 1 hour.
User access tokens
These are OAuth tokens that Sourcegraph receives when a user signs into Sourcegraph using the configured GitHub App. Sourcegraph uses these tokens to link the user's Sourcegraph account to their GitHub account, as well as determine which repositories a user should be able to access. These tokens are refreshable, and by default they expire after 8 hours. Sourcegraph refreshes the user tokens as required.
Custom Certificates
NOTE: Feature supported in Sourcegraph 5.1.5+
If you are using a self-signed certificate for your GitHub Enterprise instance, configure tls.external under experimentalFeatures in the Site configuration with your certificate(s).

    {
      "experimentalFeatures": {
        "tls.external": {
          "certificates": [
            "-----BEGIN CERTIFICATE-----\n..."
          ]
        }
      }
    }
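A PEM certificate spans several lines, so its newlines have to be escaped before it can sit inside a JSON string. One way to produce the fragment above is to let a JSON serializer do the escaping. The sketch below uses Python's standard json module; the certificate body is a placeholder, not real certificate data, and the function name is made up for illustration.

```python
import json

# Placeholder PEM text; substitute the real certificate of your instance.
PLACEHOLDER_PEM = (
    "-----BEGIN CERTIFICATE-----\n"
    "<base64 certificate data>\n"
    "-----END CERTIFICATE-----\n"
)


def tls_external_site_config(pem_certificates):
    """Render the experimentalFeatures fragment with properly escaped PEM strings."""
    config = {
        "experimentalFeatures": {
            "tls.external": {"certificates": list(pem_certificates)}
        }
    }
    # json.dumps escapes each newline in the PEM text as the two characters \n.
    return json.dumps(config, indent=2)


rendered = tls_external_site_config([PLACEHOLDER_PEM])
```

Printing rendered gives a fragment that can be merged into the site configuration by hand.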
Using an access token
To connect GitHub to Sourcegraph with an access token:
- Go to Site admin > Manage code hosts.
- Select GitHub.
- Configure the connection to GitHub using the action buttons above the text field. Additional fields can be added using Cmd/Ctrl+Space for auto-completion. See the configuration documentation below.
- Press Add repositories.
In this example, the kubernetes public repository on GitHub is added by selecting Add a single repository and replacing <owner>/<repository> with kubernetes/kubernetes:

    {
      "url": "https://github.com",
      "token": "<access token>",
      "orgs": [],
      "repos": [
        "kubernetes/kubernetes"
      ]
    }
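If connection configurations are generated by a script rather than typed into the editor, the same structure can be built programmatically. The following is a small sketch under that assumption; the helper name github_connection is hypothetical and the owner/name check is a convenience added here, not something Sourcegraph requires of the caller.

```python
import json


def github_connection(token, repos, orgs=(), url="https://github.com"):
    """Build a GitHub code host connection config like the example above."""
    for repo in repos:
        owner, sep, name = repo.partition("/")
        # Each entry must look like owner/name with exactly one slash.
        if not (owner and sep and name) or "/" in name:
            raise ValueError(f"expected owner/name, got {repo!r}")
    return {"url": url, "token": token, "orgs": list(orgs), "repos": list(repos)}


config = github_connection("<access token>", ["kubernetes/kubernetes"])
config_json = json.dumps(config, indent=2)
```

The resulting config_json string matches the example configuration and can be pasted into the connection editor.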
GitHub API access
GitHub requires a token in order to access their API. There are different types of tokens that can be supplied. When using GitHub apps, this is handled automatically by Sourcegraph.
- GitHub app installation access token: An installation access token is created automatically when you install a GitHub app. Do not set this token in the code host connection configuration. This token gives Sourcegraph the same level of access to repositories as the GitHub app installation.
- Personal access token: This gives Sourcegraph the same level of access to repositories as the account that created the token. If you don't want to mix your personal repositories with your organization's repositories, you could add an entry to the exclude array, or you can use a machine user token or a fine-grained access token.
- Fine-grained access token: Allows scoping access tokens to specific repositories with specific permissions. Consult the table below for the required permissions.
- Machine user token: Generates a token for a machine user that is affiliated with an organization instead of a user account.
Personal access token scopes
No token scopes are required if you only want to sync public repositories and don't want to use any of the following features. Otherwise, the following token scopes are required for specific features:
Feature | Required token scopes |
---|---|
Sync private repositories | repo |
Sync repository permissions | repo |
Batch changes | repo, read:org, user:email, read:discussion, and workflow (learn more) |
WARNING: In addition to the prerequisite token scopes, the account attached to the token must actually have the same level of access to the relevant resources that you are trying to grant. For example:
- If read access to repositories is required, the token must have repo scope and the token's account must have read access to the relevant repositories. This can happen by being directly granted read access to repositories, being on a team with read access to the repository, and so on.
- If write access to repositories is required, the token must have repo scope and the token's account must have write access to all repositories. This can happen by being added as a direct contributor, being on a team with write access to the repository, being an admin for the repository's organization, and so on.
- If write access to organizations is required, the token must have write:org scope and the token's account must have write access for all organizations. This can happen by being an admin in all relevant organizations.
Learn more about how the GitHub API is used and what level of access is required in the corresponding feature documentation.
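The scope table above can be turned into a quick pre-flight check. The sketch below is illustrative only: the feature keys and function names are made up, and the comparison is a plain set difference that does not model GitHub's scope hierarchy (a broader scope that implies a narrower one is not recognized). The comma-separated input format mirrors the X-OAuth-Scopes response header that GitHub returns for classic personal access tokens.

```python
# Required classic token scopes per feature, copied from the table above.
REQUIRED_SCOPES = {
    "sync_private_repositories": {"repo"},
    "sync_repository_permissions": {"repo"},
    "batch_changes": {"repo", "read:org", "user:email", "read:discussion", "workflow"},
}


def parse_scopes_header(header_value):
    """Split a comma-separated scope list such as 'repo, read:org' into a set."""
    return {scope.strip() for scope in header_value.split(",") if scope.strip()}


def missing_scopes(feature, granted):
    """Return the scopes the feature needs that are not in the granted set."""
    return REQUIRED_SCOPES[feature] - set(granted)


granted = parse_scopes_header("repo, read:org")
still_needed = missing_scopes("batch_changes", granted)
```

An empty result only means the scopes are present; as the warning above explains, the token's account must also have the matching level of access to the repositories and organizations involved.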
Fine-grained access token permissions
Fine-grained tokens can access public repositories, but can only access the private repositories of the account they are scoped to.
When creating your fine-grained access token, select the following permissions depending on the purpose of the token:
Feature | Required token permissions |
---|---|
Sync private repositories | Repository permissions: Contents - Access: Read-only |
Sync repository permissions | Repository permissions: Contents - Access: Read-only |
Batch changes | Unsupported |
WARNING: Fine-grained tokens don't support the repositoryQuery code host connection option or batch changes. Both of these features rely on GitHub's GraphQL API, which is unsupported by fine-grained access tokens.
Private repositories
To clone and search private repositories, we need a GitHub access token with the required scopes and at least read access to the relevant private repositories.
For more details, see GitHub API access.
Selecting repositories to sync
There are four fields for configuring which repositories are mirrored/synchronized:
- repos: A list of repositories in owner/name format. The order determines the order in which we sync repository metadata and is safe to change.
- orgs: A list of organizations (every repository belonging to the organization will be cloned).
- repositoryQuery: A list of strings with three pre-defined options (public, affiliated, none, none of which are subject to result limitations), and/or a GitHub advanced search query. Note: There is an existing limitation that requires GitHub advanced search queries to return fewer than 1000 results.
- exclude: A list of repositories to exclude, which takes precedence over the repos, orgs, and repositoryQuery fields.
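The precedence of exclude over the other three fields can be pictured with a small model. This is a simplified sketch, not Sourcegraph's implementation: it only handles exclude entries keyed by name or pattern (both appear in the configuration reference below), and it assumes the candidate list has already been gathered from repos, orgs, and repositoryQuery.

```python
import re


def is_excluded(repo_name, exclude):
    """Return True if any exclude entry matches the owner/name string."""
    for rule in exclude:
        if rule.get("name") == repo_name:
            return True
        pattern = rule.get("pattern")
        if pattern and re.search(pattern, repo_name):
            return True
    return False


def effective_repos(candidates, exclude):
    """Drop excluded repositories, whichever field selected them in the first place."""
    return [repo for repo in candidates if not is_excluded(repo, exclude)]


candidates = ["vuejs/vue", "topsecretorg/internal-tools", "golang/go"]
exclude = [{"name": "vuejs/vue"}, {"pattern": "^topsecretorg/.*"}]
remaining = effective_repos(candidates, exclude)
```

Here vuejs/vue is dropped even if it was listed explicitly in repos, which is the behavior the exclude field describes.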
Rate limits
Always include a token in a configuration for a GitHub.com URL to avoid being denied service by GitHub's unauthenticated rate limits. If you don't want to automatically synchronize repositories from the account associated with your personal access token, you can create a token without a repo scope for the purposes of bypassing rate limit restrictions only.
When Sourcegraph hits a rate limit imposed by GitHub, Sourcegraph waits the appropriate amount of time specified by GitHub before retrying the request. This can be several minutes in extreme cases.
GitHub Enterprise Server rate limits
Rate limiting may not be enabled by default. To check and verify the current rate limit settings, you may make a request to the /rate_limit endpoint like this:

    $ curl -s https://<github-enterprise-url>/api/v3/rate_limit -H "Authorization: Bearer <token>"
    {
      "message": "Rate limiting is not enabled.",
      "documentation_url": "https://docs.github.com/enterprise/3.3/rest/reference/rate-limit#get-rate-limit-status-for-the-authenticated-user"
    }
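When this check is scripted, the response body can be classified instead of read by eye. The sketch below parses a body that has already been fetched (for example with the curl command above). The disabled case uses the exact message shown above; the enabled sample is an illustrative, abbreviated payload with a top-level resources object, and its numbers are made up.

```python
import json

DISABLED_MESSAGE = "Rate limiting is not enabled."


def rate_limiting_enabled(response_body):
    """Classify a /rate_limit response body as enabled (True) or disabled (False)."""
    payload = json.loads(response_body)
    if payload.get("message") == DISABLED_MESSAGE:
        return False
    if "resources" in payload:
        return True
    # Anything else (for example an authentication error) is not a rate limit status.
    raise ValueError(f"unexpected /rate_limit response: {payload!r}")


disabled_sample = '{"message": "Rate limiting is not enabled."}'
# Illustrative, abbreviated sample of an enabled response.
enabled_sample = '{"resources": {"core": {"limit": 5000, "remaining": 4999}}}'
```

Raising on unrecognized bodies avoids mistaking an error response, such as bad credentials, for an instance with rate limiting turned on.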
Internal rate limits
See Internal rate limits.
Repository permissions
Prerequisite for configuring repository permission syncing: Add GitHub as an authentication provider.
Then, add or edit the GitHub connection as described above and include the authorization field:

    {
      // ...
      "authorization": {}
    }

This needs to be done for every GitHub code host connection if there is more than one configured.
Repo-centric permission syncing is done by calling the list repository collaborators GitHub API endpoint. To call this API endpoint correctly, we need a GitHub access token with the required scopes and read and write access to all relevant repositories.
IMPORTANT: We strongly recommend configuring both read and write access to associated repositories for permission syncing due to GitHub's token scope requirements. Without write access, there will be a conflict between user-centric sync and repo-centric sync. In that case, disable repo-centric permission sync (supported in Sourcegraph 5.0.4+).
IMPORTANT: Optional, but strongly recommended - continue with configuring webhooks for permissions.
NOTE: It can take some time to complete a full cycle of repository permissions sync if you have a large number of users or repositories. See sync duration time for more information.
Internal repositories
GitHub Enterprise has internal repositories in addition to the usual public and private repositories. Depending on how your organization structure is configured, you may want to make these internal repositories available to everyone on your Sourcegraph instance without relying on permission syncs. To mark all internal repositories as public, add the following field to the authorization field:

    {
      // ...
      "authorization": {
        "markInternalReposAsPublic": true
      }
    }
If you would like internal repositories to remain private, but you're experiencing issues where user permission syncs aren't granting access to internal repositories, you can add the following field instead:

    {
      // ...
      "authorization": {
        "syncInternalRepoPermissions": true
      }
    }
NOTE: An explanation of the repository visibility options in GitHub Enterprise:
- public: Only index public GitHub Enterprise repositories visible to all users. This excludes private and internal repos.
- private: Index both public and private GitHub Enterprise repositories. This allows accessing private repos the token has access to.
- internal: Include GitHub Enterprise internal repositories in addition to public/private repos. Internal repos are only visible to org members.
Trigger permissions sync from GitHub webhooks
To trigger permission syncs from GitHub webhooks, see the documentation on configuring webhooks for permissions.
Teams and organizations permissions caching
NOTE: This is an experimental feature.
WARNING: The following section is experimental and might no longer work properly on newer Sourcegraph versions (4.0+). Prefer configuring webhooks for permissions instead.
The GitHub code host can leverage caching mechanisms to reduce the number of API calls used when syncing permissions. This can significantly reduce the amount of time it takes to perform a full cycle of permissions sync due to reduced instances of being rate limited by the code host, and is useful for code hosts with very large numbers of users and repositories.
Sourcegraph can leverage caching of GitHub team and organization permissions.
NOTE: You should only try this if your GitHub setup makes extensive use of GitHub teams and organizations to distribute access to repositories and your number of users * avg_repositories is greater than 250,000 (which roughly corresponds to the scale at which GitHub rate limits might become an issue).
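The threshold in the note is a simple product, so it is easy to check for a given instance. The sketch below applies the rule of thumb as stated; the function name and the example figures are made up for illustration.

```python
# Rule-of-thumb threshold from the note above.
GROUPS_CACHE_THRESHOLD = 250_000


def groups_cache_worth_trying(users, avg_repositories):
    """Return True when users * avg_repositories exceeds the documented threshold."""
    return users * avg_repositories > GROUPS_CACHE_THRESHOLD


# 5,000 users with 100 repositories each gives 500,000, above the threshold.
large_instance = groups_cache_worth_trying(5_000, 100)
# 1,000 users with 200 repositories each gives 200,000, below the threshold.
small_instance = groups_cache_worth_trying(1_000, 200)
```

The product is only half of the condition: the note also requires that access is mostly distributed through GitHub teams and organizations.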
This caching behaviour can be enabled via the authorization.groupsCacheTTL field:

    {
      "url": "https://github.example.com",
      "token": "$PERSONAL_ACCESS_TOKEN",
      "authorization": {
        "groupsCacheTTL": 72, // hours
      }
    }
In the corresponding authorization provider in site configuration, the allowGroupsPermissionsSync field must be set as well for the correct auth scopes to be requested from users:

    {
      // ...
      "auth.providers": [
        {
          "type": "github",
          "url": "https://github.example.com",
          "allowGroupsPermissionsSync": true,
        }
      ]
    }
A token that has the required scopes and both read and write access to all relevant repositories and organizations is needed to fetch repository permissions and team memberships. Read-only access will not work with cached permissions sync, but will work with careful configuration for regular GitHub permissions sync.
When enabling this feature, we currently recommend a default groupsCacheTTL of 72 (hours, or 3 days). A lower value can be set if your teams and organizations change frequently, though the chosen value must be at least several hours for the cache to be leveraged in the event of being rate-limited (which takes an hour to recover from).
Cache invalidation happens automatically on certain webhook events, so it is recommended to configure webhook support when using cached permissions sync. Caches can also be manually invalidated if necessary.
Manually invalidate caches
To force a bypass of caches during a sync, you can manually queue users or repositories for sync with the invalidateCaches option via the Sourcegraph GraphQL API:

    mutation {
      scheduleUserPermissionsSync(user: "userid", options: {invalidateCaches: true}) {
        alwaysNil
      }
    }
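To send this mutation from a script, it has to be wrapped in a JSON request body. The sketch below only builds the request object and does not send it. It assumes the GraphQL endpoint is /.api/graphql on the Sourcegraph instance and that an access token is passed in an Authorization: token header; verify both against your instance's API documentation. The URL, token, and user ID values are placeholders.

```python
import json
import urllib.request


def build_permissions_sync_request(sourcegraph_url, access_token, user_id):
    """Build (but do not send) a request that schedules a cache-bypassing sync."""
    query = (
        "mutation { scheduleUserPermissionsSync("
        f"user: {json.dumps(user_id)}, options: {{invalidateCaches: true}}"
        ") { alwaysNil } }"
    )
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=sourcegraph_url.rstrip("/") + "/.api/graphql",  # assumed endpoint path
        data=body,
        headers={
            "Authorization": f"token {access_token}",  # assumed header format
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_permissions_sync_request(
    "https://sourcegraph.example.com/", "<access token>", "userid"
)
```

Passing the request to urllib.request.urlopen would submit it; that step is left out so the sketch has no network side effects.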
User authentication
To configure GitHub as an authentication provider (which will enable sign-in via GitHub), see the authentication documentation.
Webhooks
Using the webhooks property on the external service has been deprecated.
Please consult the webhooks documentation to configure webhooks.
Configuration
GitHub connections support the following configuration options, which are specified in the JSON editor in the site admin "Manage code hosts" area.
admin/code_hosts/github.schema.json
    {
      // If non-null, enforces GitHub repository permissions. This requires that there is an item in the [site configuration json](https://sourcegraph.com/docs/admin/config/site_config#auth-providers) `auth.providers` field, of type "github" with the same `url` field as specified in this `GitHubConnection`.
      "authorization": null,

      // TLS certificate of the GitHub Enterprise instance. This is only necessary if the certificate is self-signed or signed by an internal CA. To get the certificate run `openssl s_client -connect HOST:443 -showcerts < /dev/null 2> /dev/null | openssl x509 -outform PEM`. To escape the value into a JSON string, you may want to use a tool like https://json-escape-text.now.sh.
      "certificate": null,
      // Other example values:
      // - "-----BEGIN CERTIFICATE-----\n..."

      // Only used to override the cloud_default column from a config file specified by EXTSVC_CONFIG_FILE
      "cloudDefault": false,

      // When set to true, this external service will be chosen as our 'Global' GitHub service. Only valid on Sourcegraph.com. Only one service can have this flag set.
      "cloudGlobal": false,

      // A list of repositories to never mirror from this GitHub instance. Takes precedence over "orgs", "repos", and "repositoryQuery" configuration.
      //
      // Supports excluding by name ({"name": "owner/name"}) or by ID ({"id": "MDEwOlJlcG9zaXRvcnkxMTczMDM0Mg=="}).
      //
      // Note: ID is the GitHub GraphQL ID, not the GitHub database ID. eg: "curl https://api.github.com/repos/vuejs/vue | jq .node_id"
      "exclude": null,
      // Other example values:
      // - [{"forks":true}]
      // - [
      //     {
      //       "name": "owner/name"
      //     },
      //     {
      //       "id": "MDEwOlJlcG9zaXRvcnkxMTczMDM0Mg=="
      //     }
      //   ]
      // - [
      //     {
      //       "name": "vuejs/vue"
      //     },
      //     {
      //       "name": "php/php-src"
      //     },
      //     {
      //       "pattern": "^topsecretorg/.*"
      //     }
      //   ]
      // - [
      //     {
      //       "size": ">= 1GB",
      //       "stars": "< 100"
      //     }
      //   ]

      // If non-null, this is a GitHub App connection with some additional properties.
      "gitHubAppDetails": null,

      // The type of Git URLs to use for cloning and fetching Git repositories on this GitHub instance.
      //
      // If "http", Sourcegraph will access GitHub repositories using Git URLs of the form http(s)://github.com/myteam/myproject.git (using https: if the GitHub instance uses HTTPS).
      //
      // If "ssh", Sourcegraph will access GitHub repositories using Git URLs of the form git@github.com:myteam/myproject.git. See the documentation for how to provide SSH private keys and known_hosts: https://sourcegraph.com/docs/admin/repo/auth#repositories-that-need-http-s-or-ssh-authentication.
      "gitURLType": "http",

      // DEPRECATED: The installation ID of the GitHub App.
      "githubAppInstallationID": null,

      // Deprecated and ignored field which will be removed entirely in the next release. GitHub repositories can no longer be enabled or disabled explicitly. Configure repositories to be mirrored via "repos", "exclude" and "repositoryQuery" instead.
      "initialRepositoryEnablement": null,

      // An array of organization names identifying GitHub organizations whose repositories should be mirrored on Sourcegraph.
      "orgs": null,
      // Other example values:
      // - ["name"]
      // - [
      //     "kubernetes",
      //     "golang",
      //     "facebook"
      //   ]

      // Whether the code host connection is in a pending state.
      "pending": false,

      // Rate limit applied when making background API requests to GitHub.
      "rateLimit": {
        "enabled": true,
        "requestsPerHour": 5000
      },

      // An array of repository "owner/name" strings specifying which GitHub or GitHub Enterprise repositories to mirror on Sourcegraph.
      "repos": null,
      // Other example values:
      // - ["owner/name"]
      // - [
      //     "kubernetes/kubernetes",
      //     "golang/go",
      //     "facebook/react"
      //   ]

      // The pattern used to generate the corresponding Sourcegraph repository name for a GitHub or GitHub Enterprise repository. In the pattern, the variable "{host}" is replaced with the GitHub host (such as github.example.com), and "{nameWithOwner}" is replaced with the GitHub repository's "owner/path" (such as "myorg/myrepo").
      //
      // For example, if your GitHub Enterprise URL is https://github.example.com and your Sourcegraph URL is https://src.example.com, then a repositoryPathPattern of "{host}/{nameWithOwner}" would mean that a GitHub repository at https://github.example.com/myorg/myrepo is available on Sourcegraph at https://src.example.com/github.example.com/myorg/myrepo.
      //
      // It is important that the Sourcegraph repository name generated with this pattern be unique to this code host. If different code hosts generate repository names that collide, Sourcegraph's behavior is undefined.
      "repositoryPathPattern": "{host}/{nameWithOwner}",

      // An array of strings specifying which GitHub or GitHub Enterprise repositories to mirror on Sourcegraph. The valid values are:
      //
      // - `public` mirrors all public repositories for GitHub Enterprise and is the equivalent of `none` for GitHub
      //
      // - `internal` mirrors all internal repositories for GitHub Enterprise and is the equivalent of `none` for GitHub
      //
      // - `affiliated` mirrors all repositories affiliated with the configured token's user:
      //   - Private repositories with read access
      //   - Public repositories owned by the user or their orgs
      //   - Public repositories with write access
      //
      // - `none` mirrors no repositories (except those specified in the `repos` configuration property or added manually)
      //
      // - All other values are executed as a GitHub advanced repository search as described at https://github.com/search/advanced. Example: to sync all repositories from the "sourcegraph" organization including forks the query would be "org:sourcegraph fork:true".
      //
      // If multiple values are provided, their results are unioned.
      //
      // If you need to narrow the set of mirrored repositories further (and don't want to enumerate it with a list or query set as above), create a new bot/machine user on GitHub or GitHub Enterprise that is only affiliated with the desired repositories.
      "repositoryQuery": [
        "none"
      ],

      // A GitHub personal access token. Create one for GitHub.com at https://github.com/settings/tokens/new?description=Sourcegraph (for GitHub Enterprise, replace github.com with your instance's hostname). See https://sourcegraph.com/docs/admin/code_host_connection/github#github-api-token-and-access for which scopes are required for which use cases.
      "token": null,

      // URL of a GitHub instance, such as https://github.com or https://github-enterprise.example.com.
      "url": null,
      // Other example values:
      // - "https://github.com"
      // - "https://github-enterprise.example.com"

      // An array of configurations defining existing GitHub webhooks that send updates back to Sourcegraph.
      "webhooks": null
      // Other example values:
      // - [
      //     {
      //       "org": "yourorgname",
      //       "secret": "webhook-secret"
      //     }
      //   ]
    }
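The repositoryPathPattern substitution described in the schema comments can be reproduced in a few lines, which is handy for checking that two code hosts will not generate colliding names. This is a sketch of the documented substitution only, not Sourcegraph's own code, and the function name is hypothetical.

```python
def sourcegraph_repo_name(pattern, host, name_with_owner):
    """Expand {host} and {nameWithOwner} in a repositoryPathPattern."""
    return pattern.replace("{host}", host).replace("{nameWithOwner}", name_with_owner)


# The default pattern from the schema, applied to the example in its comments.
default_name = sourcegraph_repo_name(
    "{host}/{nameWithOwner}", "github.example.com", "myorg/myrepo"
)

# A custom pattern that drops the host and adds a fixed prefix.
prefixed_name = sourcegraph_repo_name(
    "devs/{nameWithOwner}", "github.example.com", "myorg/myrepo"
)
```

With the default pattern the repository is available under github.example.com/myorg/myrepo on the Sourcegraph instance, matching the example in the schema comment. Patterns that omit {host} are the ones most likely to collide when several code hosts are connected.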
Default branch
Sourcegraph displays search results from the default branch of a repository when no revision: parameter is specified. If you'd like the search results to be displayed from another branch by default, you may change a repo's default branch on the GitHub repo settings page. If this is not an option, consider using search contexts instead.
Troubleshooting
Hitting GitHub Search API rate limit with repositoryQuery
When Sourcegraph syncs repositories configured via repositoryQuery
, it consumes GitHub API search rate limit, which is lower than the normal rate limit. The affiliated
, public
, and none
special values, however, trigger normal API requests instead of search API requests. internal
is also a special value that uses the GitHub Search API to list all internal repositories.
When the search rate limit quota is exhausted, an error like failed to list GitHub repositories for search: page=..., searchString=\"...\"
can be found in logs. To work around this try reducing the frequency with which repository syncing happens by setting a higher value (in minutes) of repoListUpdateInterval
in your Sourcegraph site config.
`repositoryQuery` is the only repo syncing method that consumes GitHub Search API quota, so if setting `repoListUpdateInterval` doesn't work, consider switching your syncing method to another option, like `orgs`, or using one of the special values described above.
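To confirm that the search quota (rather than the core quota) is the one being exhausted, you can inspect the response of GitHub's `GET /rate_limit` REST endpoint, which reports each quota separately under `resources`. The following is a minimal sketch that summarizes an already-parsed response; the function name `summarize_quota` and the sample numbers are illustrative assumptions, and fetching the payload (with your own token and instance URL) is left out.

```python
from datetime import datetime, timezone


def summarize_quota(rate_limit_payload: dict) -> dict:
    """Summarize core and search quota from a parsed GET /rate_limit response."""
    summary = {}
    for name in ("core", "search"):
        bucket = rate_limit_payload["resources"][name]
        summary[name] = {
            "limit": bucket["limit"],
            "remaining": bucket["remaining"],
            "exhausted": bucket["remaining"] == 0,
            # "reset" is a Unix timestamp in seconds in the API response.
            "resets_at": datetime.fromtimestamp(
                bucket["reset"], tz=timezone.utc
            ).isoformat(),
        }
    return summary


# Hypothetical sample payload, trimmed to the fields used above.
sample = {
    "resources": {
        "core": {"limit": 5000, "remaining": 4871, "reset": 1700000000},
        "search": {"limit": 30, "remaining": 0, "reset": 1700000060},
    }
}
quota = summarize_quota(sample)
print(quota["search"])
```

If `search` is exhausted while `core` still has headroom, the sync is most likely limited by `repositoryQuery` search requests, and the workarounds above apply.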
"repositoryQuery": ["public"] does not return archived status of a repo
The `repositoryQuery` option `"public"` is valuable because it allows Sourcegraph to sync all public repositories. However, it does not return whether a repository is archived, so archived repositories can appear in normal search. You can see an example of what is returned by the GitHub API for a query to "public" here.
If you would like to sync all public repositories while omitting archived ones, consider generating a GitHub token with access to only public repositories, then use `repositoryQuery` with the `affiliated` option and an `exclude` entry with the `archived` option, as seen in the example below:
```json
{
  "url": "https://github.example.com",
  "gitURLType": "http",
  "repositoryPathPattern": "devs/{nameWithOwner}",
  "repositoryQuery": [
    "affiliated"
  ],
  "token": "TOKEN_WITH_PUBLIC_ACCESS",
  "exclude": [
    {
      "archived": true
    }
  ]
}
```
Configuration Notes
The GitHub connection schema provides several configuration patterns for different use cases:
- Repository Selection: Use `repos` for specific repositories, `orgs` for all repositories in organizations, and `repositoryQuery` for advanced filtering with search queries
- Path Patterns: The `repositoryPathPattern` field controls how repository names appear in Sourcegraph URLs. Ensure patterns are unique across code hosts to avoid collisions
- Token Configuration: The `token` field supports personal access tokens, fine-grained tokens, and machine user tokens depending on your access requirements
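To see why path patterns need to be unique across code hosts, the sketch below renders repository names from a pattern and reports names produced by more than one connection. It assumes the placeholders `{host}` and `{nameWithOwner}` and a default pattern of `{host}/{nameWithOwner}`; the helper names `render_repo_name` and `find_collisions` are hypothetical and this is not Sourcegraph's own implementation.

```python
from __future__ import annotations

from collections import defaultdict
from urllib.parse import urlparse

# Assumed default pattern when "repositoryPathPattern" is not set.
DEFAULT_PATTERN = "{host}/{nameWithOwner}"


def render_repo_name(pattern: str, code_host_url: str, name_with_owner: str) -> str:
    """Substitute the assumed placeholders into a path pattern."""
    host = urlparse(code_host_url).hostname or ""
    return pattern.replace("{host}", host).replace("{nameWithOwner}", name_with_owner)


def find_collisions(connections: list[dict]) -> list[str]:
    """Return rendered names that are produced by more than one code host URL."""
    seen: dict[str, set[str]] = defaultdict(set)
    for conn in connections:
        pattern = conn.get("repositoryPathPattern", DEFAULT_PATTERN)
        for repo in conn.get("repos", []):
            seen[render_repo_name(pattern, conn["url"], repo)].add(conn["url"])
    return sorted(name for name, urls in seen.items() if len(urls) > 1)


# Two hypothetical connections that share a host-independent pattern.
colliding = [
    {"url": "https://github.com", "repositoryPathPattern": "code/{nameWithOwner}", "repos": ["myorg/api"]},
    {"url": "https://github.example.com", "repositoryPathPattern": "code/{nameWithOwner}", "repos": ["myorg/api"]},
]
print(find_collisions(colliding))
```

A pattern that omits the host, such as `code/{nameWithOwner}`, maps `myorg/api` from both hosts to the same name, whereas a pattern that includes the host keeps them apart.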
Rate Limiting Configuration
The `rateLimit` configuration allows fine-tuning of API request rates:
- Default: 5000 requests per hour with rate limiting enabled
- Adjust `requestsPerHour` based on your GitHub instance's limits and usage patterns
- Setting `enabled: false` disables rate limiting but may result in API errors
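When choosing a value for `requestsPerHour`, it can help to translate it into the average spacing between requests. The arithmetic below is only an illustration of that conversion; it assumes requests are spread evenly over the hour, which may not match how the limiter actually schedules or bursts requests, and `seconds_between_requests` is a hypothetical helper.

```python
def seconds_between_requests(requests_per_hour: float) -> float:
    """Average gap between requests if they were spread evenly over one hour."""
    if requests_per_hour <= 0:
        raise ValueError("requests_per_hour must be positive")
    return 3600 / requests_per_hour


for rate in (5000, 3000):
    gap = seconds_between_requests(rate)
    print(f"{rate} requests/hour is about one request every {gap:.2f} s")
```

Lowering the value from 5000 to 3000 therefore stretches the average gap from roughly 0.72 seconds to 1.2 seconds, which slows syncing in exchange for leaving more quota to other consumers of the same token.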
Repository Query Options
The `repositoryQuery` field accepts several special values:
- `public`: All public repositories (GitHub Enterprise only)
- `affiliated`: Repositories the token user has access to
- `internal`: Internal repositories (GitHub Enterprise only)
- `none`: No automatic repository discovery
- Custom search queries: GitHub advanced search syntax for complex filtering
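Because only some of these values count against the Search API quota, it can be useful to check a `repositoryQuery` list before applying it. The sketch below follows the split described in the troubleshooting section (`affiliated`, `public`, and `none` use normal API requests; `internal` and custom queries use the Search API). The helper `split_by_api` is a hypothetical name, not part of Sourcegraph.

```python
from __future__ import annotations

# Special values that trigger normal API requests rather than search requests.
NORMAL_API_VALUES = {"affiliated", "public", "none"}


def split_by_api(repository_query: list[str]) -> dict[str, list[str]]:
    """Group repositoryQuery entries by the GitHub API quota they draw from."""
    result: dict[str, list[str]] = {"normal_api": [], "search_api": []}
    for entry in repository_query:
        key = "normal_api" if entry in NORMAL_API_VALUES else "search_api"
        result[key].append(entry)
    return result


usage = split_by_api(["affiliated", "internal", "org:myorg language:go stars:>10"])
print(usage)
```

Entries that land under `search_api` are the ones to reduce or replace if the search rate limit is being hit.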
Security Considerations
Token Security and Scoping
Personal Access Tokens: Grant broad access equivalent to the token owner's permissions. Consider these security practices:
- Use dedicated service accounts rather than personal tokens when possible
- Apply the principle of least privilege - only grant necessary scopes
- Regularly rotate tokens and audit their usage
Fine-grained Access Tokens: Provide repository-specific permissions but have limitations:
- Only work with repositories owned by the token creator
- Cannot access GitHub's GraphQL API (required for `repositoryQuery` and Batch Changes)
- Limited to REST API functionality
Certificate Validation
For GitHub Enterprise instances with self-signed certificates:
- Use the `certificate` field to specify custom CA certificates
- Obtain certificates using: `openssl s_client -connect HOST:443 -showcerts`
- For Sourcegraph 5.1.5+, also configure `experimentalFeatures.tls.external` in site configuration
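A PEM certificate spans several lines, but the `certificate` field is a single JSON string, so the line breaks have to be escaped as `\n`. The sketch below lets Python's `json` module handle the escaping; `connection_with_certificate` is a hypothetical helper and the PEM body is a placeholder, not a real certificate.

```python
import json


def connection_with_certificate(base_config: dict, pem_text: str) -> str:
    """Return the connection config as JSON with the PEM text safely escaped."""
    config = dict(base_config)
    config["certificate"] = pem_text.strip()
    return json.dumps(config, indent=2)


# Placeholder PEM text; paste the output of the openssl command here instead.
pem = (
    "-----BEGIN CERTIFICATE-----\n"
    "PLACEHOLDER_BASE64_BODY\n"
    "-----END CERTIFICATE-----\n"
)
rendered = connection_with_certificate({"url": "https://github.example.com"}, pem)
print(rendered)
```

The printed value contains literal `\n` sequences inside the string, which is the form shown in the configuration examples in this document.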
Repository Permissions
When enabling `authorization`, consider these security implications:
- Requires tokens with both read and write access to repositories for proper permission syncing
- Without write access, conflicts may arise between user-centric and repo-centric permission sync
- Internal repositories can be marked as public with `markInternalReposAsPublic` if organizational structure permits
Webhook Security
GitHub webhooks configured through the deprecated `webhooks` property should be migrated to the new incoming webhooks configuration:
- Webhook secrets provide authentication and prevent spoofing
- Proper webhook configuration enables real-time permission updates
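To illustrate how a webhook secret prevents spoofing: GitHub signs each delivery with an HMAC-SHA256 of the request body, keyed by the secret, and sends it in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The sketch below shows how a receiver could check that header. It is a generic illustration of the scheme, not Sourcegraph's own handler, and `signature_is_valid` is a hypothetical name.

```python
import hashlib
import hmac


def signature_is_valid(secret: str, body: bytes, signature_header: str) -> bool:
    """Check an X-Hub-Signature-256 header value against the raw request body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))


# Hypothetical delivery, signed the same way the sender would sign it.
secret = "webhook-secret"
body = b'{"action": "added"}'
header = "sha256=" + hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()

print(signature_is_valid(secret, body, header))
print(signature_is_valid(secret, b'{"action": "removed"}', header))
```

A payload that was altered in transit, or sent by someone who does not know the secret, fails the check because its digest no longer matches the header.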
Common Examples
Basic Repository Sync
```json
{
  "url": "https://github.com",
  "token": "ghp_xxxxxxxxxxxxxxxxxxxx",
  "repos": [
    "kubernetes/kubernetes",
    "golang/go"
  ]
}
```
Organization-wide Sync with Exclusions
```json
{
  "url": "https://github.com",
  "token": "ghp_xxxxxxxxxxxxxxxxxxxx",
  "orgs": ["myorg"],
  "exclude": [
    {"name": "myorg/legacy-repo"},
    {"archived": true}
  ]
}
```
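The effect of the `exclude` list above can be pictured as a filter over the repositories discovered for the organization. The sketch below is a simplified model under stated assumptions: each rule has a single key (`name` or `archived`), names are compared case-insensitively, and repositories are dictionaries with the `full_name` and `archived` fields found in GitHub REST API repository objects. It is not Sourcegraph's actual matching logic, and `is_excluded` is a hypothetical helper.

```python
from __future__ import annotations


def is_excluded(repo: dict, exclude_rules: list[dict]) -> bool:
    """Return True if any single-key exclude rule matches the repository."""
    for rule in exclude_rules:
        if "name" in rule and rule["name"].lower() == repo["full_name"].lower():
            return True
        if rule.get("archived") and repo.get("archived", False):
            return True
    return False


rules = [{"name": "myorg/legacy-repo"}, {"archived": True}]

# Hypothetical repositories returned for the "myorg" organization.
discovered = [
    {"full_name": "myorg/api", "archived": False},
    {"full_name": "myorg/legacy-repo", "archived": False},
    {"full_name": "myorg/old-site", "archived": True},
]
synced = [r["full_name"] for r in discovered if not is_excluded(r, rules)]
print(synced)
```

In this model only `myorg/api` would be synced: one repository is dropped by name and the other because it is archived.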
Advanced Query with Custom Path Pattern
```json
{
  "url": "https://github.enterprise.com",
  "token": "ghp_xxxxxxxxxxxxxxxxxxxx",
  "repositoryQuery": ["org:myorg language:go stars:>10"],
  "repositoryPathPattern": "code/{nameWithOwner}",
  "gitURLType": "ssh"
}
```
GitHub Enterprise with Authorization
```json
{
  "url": "https://github.enterprise.com",
  "token": "ghp_xxxxxxxxxxxxxxxxxxxx",
  "orgs": ["engineering", "devops"],
  "authorization": {
    "markInternalReposAsPublic": true
  },
  "certificate": "-----BEGIN CERTIFICATE-----\n..."
}
```
Rate Limited Configuration
```json
{
  "url": "https://github.com",
  "token": "ghp_xxxxxxxxxxxxxxxxxxxx",
  "repositoryQuery": ["affiliated"],
  "rateLimit": {
    "enabled": true,
    "requestsPerHour": 3000
  }
}
```
Best Practices
Token Management
- Use GitHub Apps when possible instead of personal access tokens for better security and scalability
- Create dedicated service accounts for Sourcegraph rather than using personal tokens
- Implement token rotation policies to regularly refresh access tokens
- Audit token permissions regularly to ensure they follow least-privilege principles
Repository Organization
- Use consistent naming patterns with `repositoryPathPattern` to maintain organization
- Leverage exclusion patterns to filter out unwanted repositories efficiently
- Prefer specific repository lists over broad queries when possible for better performance
- Implement repository archival policies to exclude archived repositories from active search
Performance Optimization
- Configure appropriate rate limits based on your GitHub instance's capacity
- Use repository queries wisely - avoid complex search queries that may hit rate limits
- Enable caching for large organizations using `authorization.groupsCacheTTL` when appropriate
- Monitor sync performance and adjust `repoListUpdateInterval` in site configuration if needed
Permission Management
- Enable authorization for private repositories to ensure proper access control
- Configure webhook support for real-time permission updates
- Use write access tokens for permission syncing to avoid conflicts
- Regularly audit repository access to ensure permissions remain appropriate
Maintenance
- Monitor connection health through the site admin interface
- Keep certificates updated for GitHub Enterprise instances
- Review and update exclusion patterns as your repository landscape changes
- Test configuration changes in a staging environment when possible