Nonparametric estimation of survival and sub-distribution functions in event history analysis. The package provides extensions to competing risks, multi-state models and interval censored observation. The model specification is adapted from the survival package.
The function `prodlim' extends the function `survfit' of the survival package as follows:
R (>= 1.9.1), KernSmooth
`Hist' is short for `History'. The function generalizes the function `Surv' of the survival package to competing risks models and mixed case interval censored data.
In a classical survival setting where the status variable takes only two values (dead or alive=, then `Surv' and `Hist' provide more or less the same:
library(survival) library(prodlim) data(pbc) help(pbc) pbc.Hist <- Hist(pbc$time,pbc$status) pbc.Surv <- Surv(pbc$time,pbc$status) all(unclass(pbc.Hist)==unclass(pbc.Surv))
library(survival) library(prodlim) data(pbc) help(pbc) pbc.Hist <- Hist(pbc$time,pbc$status) pbc.Surv <- Surv(pbc$time,pbc$status) all(unclass(pbc.Hist)==unclass(pbc.Surv))
However, the attributes of the objects and the print/summary functionality are different. Furthermore, there is a plot method for Hist objects.
## Two-state survival models
summary(pbc.Surv)
summary(pbc.Hist)
plot(pbc.Hist)
plot(pbc.Hist,state.lab=c("Alive","Dead"),style="count")
## two competing risks
comprisk.model <- data.frame(time=1:3,status=1:3)
CRHist <- with(comprisk.model,Hist(time,status,cens.code=2))
plot(CRHist,
cex=2,
state.lab=c("Alive","Dead\n cause 1","Dead\n cause 2"),
style="character",ybox.rule=1,
arrow.lab=c(expression(gamma[1](t),gamma[2](t))),
enumerate.boxes=TRUE,
cex.boxlabs = 1.28)
## illness-death models
illness.death.frame <-
data.frame(time=1:4,
from=c("Disease-free","Disease-free",
"Diseased","Disease-free"),
to=c("0","Diseased","Dead","Dead"))
IDHist <- with(illness.death.frame,
Hist(time,event=list(from,to)))
plot(IDHist,
ybox.rule=4,
xbox.rule=.3,
state.cex=1.3,
enum=TRUE,
arrow.lab.side=c(-1,-1,1))
See the help page of plot.Hist for more examples.
The Kaplan-Meier estimator for the survival function of the event times.
pbc.survfit <- survfit(Surv(time,status)~1,data=pbc) pbc.prodlim <- prodlim(Hist(time,status)~1,data=pbc) all(round(pbc.prodlim$surv,12)==round(pbc.survfit$surv,12))
We get same result except for some strange rounding errors. It seems that the discrepancy is due to different definitions of the status variable in the c-routines (either as `integer' or as `double').
The so-called reverse Kaplan-Meier estimator for the survival function of the censoring times. The estimator can usually NOT be only be obtained by changing from status to 1-status. With one exception: when there are NO ties between event times and censoring times.
To compute the reverse Kaplan-Meier specify `reverse=TRUE':
pbc.survfit.cens <- survfit(Surv(time,1-status)~1,data=pbc) pbc.prodlim.cens <- prodlim(Surv(time,status)~1,data=pbc,reverse=TRUE) all(round(pbc.prodlim.cens$surv,2)==round(pbc.survfit.cens$surv,2)) [1] TRUE all(round(pbc.prodlim.cens$surv,3)==round(pbc.survfit.cens$surv,3)) [1] FALSE
The conditional kernel smoothed Kaplan-Meier for continuous covariates as defined by Beran (1981). Currently only one continuous covariate is allowed. For each value of the covariate the method finds a symmetric nearest neighbourhood (NN) and computes the ordinary Kaplan-Meier using only the subset of the data which corresponds to individuals in this neighbourhood.
To compute the conditional Kaplan-Meier estimator with respect to `age' in the pbc example put `NN(age)' in the right hand side of the formula.
require(KernSmooth) neighborhood(pbc$age) pbc.prodlim.age <- prodlim(Surv(time,status)~NN(age),data=pbc) plot(pbc.prodlim.age)
SimSurv(N= 20, surv = list(dist="rgomp", args = list(shape=1, scale=2), baseline = 1/10, link = "exp", coef = 0.15, transform = NULL, method = "simulation"), cens = NULL, cova = list(X1=list("rnorm",mean=0,sd=0)))
library(pec)
R (>= 1.9.1), prodlim, Design, survival
dat <- data.frame(time=round(rexp(300,.2),2),status=rbinom(300,1,.6))
dat$X <- dat$time+round(rnorm(300,2,3),2)
models <- list("Kaplan.Meier"=survfit(Surv(time,status)~1,data=dat),
"Cox"=coxph(Surv(time,status)~X,data=dat))
pred.error <- pec(object=models,
formula=Surv(time,status)~X,
data=dat,
exact=TRUE,
cens.model="marginal",
replan="none",
B=0,
verbose=TRUE)
print(pred.error)
summary(pred.error)
plot(pred.error,xlim=c(0,30))
bootPE <- pec(object=models,
formula=Surv(time,status)~X,
data=dat,
exact=TRUE,
cens.model="marginal",
replan="boot632plus",
B=100,
keep.matrix=TRUE,
verbose=TRUE)
print(bootPE)
summary(bootPE)
plot(bootPE)